ABSTRACT

This report is an integral part of the publication 
series of the Youth in Transition study, a nationwide panel survey of 
adolescent boys, which attempts to discover and document how the 
contemporary social environments affect the development of young men 
during their high school years. Four waves of data were gathered from 
2,213 boys comprising the sample, who were clustered into 87 
different high schools throughout the country. Additionally, because 
of the special interest in the school environment, data were 
collected from the principals, counselors, and samples of teachers in 
the participating schools. The efforts reported in this study are 
based on the attempt to outline a practical procedure to be used in 
longitudinal analyses of Youth in Transition data, the major aim 
being to develop a strategy which can be applied to most, if not all, 
of the analyses to be performed. The report develops a "parallel
prediction" model for longitudinal analysis, which makes separate use
of each repetition of the criterion dimension; it is contended that
the proposed strategy is widely applicable in studies employing panel
designs. The proposed model was applied to a limited set of analyses
of the Youth in Transition data. Early identification of subgroups
was seen to have a facilitating effect in longitudinal analysis.
(Author/RJ)
PREFACE



This report is written in partial fulfillment 
of the requirements for the degree of Doctor of 
Philosophy in The University of Michigan. It is 
also an integral part of the publication series of 
the Youth in Transition study, a nationwide panel 
survey of adolescent boys. This study is attempt-
ing to discover and document how the contemporary
social environments affect the development of 
young men during their high school years. 

The first major data collection was fielded 
in the fall of 1966. The respondents were chosen 
so as to be representative of sophomore boys in 
public high schools in the United States that fall. 
The 2,213 boys comprising the sample were clustered 
into 87 different high schools throughout the 
country.¹

Three additional waves of data have now been 
gathered from this sample of boys. Additionally, 
because of our special interest in the school envi- 
ronment, data were collected from the principals, 
counselors, and (samples of) teachers in each of
the participating schools. These "organizational"
data were gathered at about the same time as our
second wave of data from the boys who were then
just completing their junior years. The combi-
nation of four waves of boys' data and the organi-
zational data provides a rich set of measures for
longitudinal analyses.

1 For a more detailed description of the sam-
pling design, see Bachman, et al., 1967, pp. 21-24.

The efforts reported herein are based on the 
author's attempt to outline a practical procedure 
to be used in longitudinal analyses of the Youth 
in Transition data. The major aim of these efforts 
is to develop a strategy which can be applied to 
most, if not all, of the longitudinal analyses to 
be performed. This attempt to generate a general- 
purpose procedure is in keeping with a need 
expressed in the study's first discussion of its 
analysis design:

Because of the broad scope of the
project, and especially because of its
longitudinal design, the possibilities
for data analysis are vast. It is there- 
fore essential that we develop systems 
and procedures of analysis that give high 
priority to data integration and that we 
provide strategies for examining many 
substantive questions simultaneously 
(Bachman, et al., 1967, p. 81). 

Another characteristic of these efforts should
be noted at this point. A project as large as
Youth in Transition is best viewed as something
other than an hypotheses-testing venture. We have
previously attempted to describe our philosophical
approach as one of "theoretically-guided empiricism"
(Bachman, et al., 1967, p. 17). In doing so, we do
not mean to denigrate the use of the hypothetical-
deductive model which underlies much of the
methodology associated with inferential statistics.
Rather, we mean to emphasize that we see our major
contribution primarily in the inductive phase of
theory development and not in the deductive or
model-testing phase.

2 Of course, some of the original respondents
had left school by this time. However, we con-
tinued to include both "dropouts" and "stay ins" in
subsequent data collections.

A colleague has described such efforts as 
attempts to "find variables that work" (Sonquist, 
1969, pp. 83-95). He describes the major problem 
as "...one of determining which of the variables 
for which data have been collected are actually 
related to the phenomenon in question, and under 
what conditions and through which intervening
processes, with appropriate controls for spuri- 
ousness" (Sonquist, 1970, p. 1). Thus, we will 
make frequent use of techniques appropriate to 
this "finding variables that work" mission. In 
doing so, our approach may be described as an 
effort in "discovering grounded theory" (Glaser
and Strauss, pp. 1-18).

Finally, because of the project's major 
emphasis on exploring the impact of the school 
environment, the analyses to be reported will 
focus on the 1,374 young men who remained in the 
same high school during the first three data 
collections. Concentrating these analyses on this
non-moving, non-dropout subset should not be taken 
as an indication of disinterest in the other young 
men in the sample. Quite the contrary; specific 
analysis plans have already been made to examine 
other subsets of the data, and their results will 
be reported in forthcoming monographs. However, 
we do not want to confound our initial analyses of 
environmental effects by including boys who were 
not exposed to the same school environment through- 
out their high school years. 
Chapter 1

INTRODUCTION TO THE 
IMPORTANT ISSUES 



One of the important objectives of the Youth 
in Transition study is to examine some of the major 
changes taking place in adolescent boys during the 
high school years. Consequently, measures of many 
important dimensions were taken at each of the four 
data collections shown in Figure 1-1. These
repeated dimensions may be placed into seven
classes: motives, affective states, self-concept,
values, attitudes, plans, and behaviors. There
are approximately 45 such dimensions which have
been included in all four waves of data.

Now one of the prime areas of analysis is
focused on attempting to explain how the immediate
social environment affects the motives, values,
plans, etc., of adolescent boys. Thus, we wish to
consider to what extent these repeatedly measured
dimensions may be predicted from characteristics
of their home and family background, school and peer
group environment, and job environment. This
brings into the picture several hundred measures of
these important social environments to be considered
as potentially important predictors of the repeatedly
measured criterion dimensions.

1 For a more comprehensive description of the
variables included in the study, see Bachman,
et al., 1967, Chapter 4.

2 The predictability of the initial measures
is summarized in Bachman, 1970.

FIGURE 1-1

THE YOUTH IN TRANSITION STUDY
OVERVIEW OF RESEARCH DESIGN

Data from boys:

TIME 1: Fall, 1966 (early tenth grade)
    YIT tests, interviews, questionnaires
    N=2213 (97% original sample)

TIME 2: Spring, 1968 (late eleventh grade)
    YIT interviews, questionnaires (repeated)
    N=1890 (83% original sample)

TIME 3: Spring, 1969 (late twelfth grade)
    YIT repeated measures & military plans and attitudes
    N=1800 (79% original sample)

TIME 4: Summer, 1970 (one year beyond graduation)
    YIT new and repeated measures & new and repeated
    measures of military plans and attitudes
    N=1620 (71% original sample)

Data from school personnel:

SCHOOL INFLUENCES
    Perceptions of the school environment:
    questionnaires from teachers, counselors, principals

There is one additional complication which 
should also be mentioned. For each repeated 
criterion dimension, we will have four separate 
observations (one at each of the four data waves) 
available, each of considerable interest as a 
criterion variable in its own right. In addition, 
we can examine up to six kinds of changes by deriv- 
ing measures which correspond to the intervals 
between each pair of measures. (See Figure 1-2.) 

In all then, we have potentially ten versions of
each of the above-mentioned criteria, the four
static or cross-sectional scores and the six
dynamic or change scores.³
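The count of ten versions can be checked with a short sketch. The month offsets below are assumptions read off the interval lengths given in Figure 1-2 (18, 30, and 44 months after the first wave); the names `WAVE_MONTH` and `change_intervals` are hypothetical and introduced only for illustration.

```python
from itertools import combinations

# Assumed months elapsed since Wave 1, implied by the interval lengths in Figure 1-2.
WAVE_MONTH = {1: 0, 2: 18, 3: 30, 4: 44}


def change_intervals(wave_month):
    """Return {(earlier_wave, later_wave): months} for every pair of waves."""
    return {
        (a, b): wave_month[b] - wave_month[a]
        for a, b in combinations(sorted(wave_month), 2)
    }


intervals = change_intervals(WAVE_MONTH)

# Four static (cross-sectional) scores plus one change score per interval.
n_versions = len(WAVE_MONTH) + len(intervals)
```

With four waves there are six pairs of waves, so `n_versions` comes to ten, before allowing for the competing ways of deriving each change score.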

It is perhaps as obvious to the reader as it 
was to the study staff that a general purpose 
strategy is an absolute necessity when faced with 
the plethora of predictor and criterion variables 
to be analyzed. To put it another way, we have 
neither the time nor the desire to custom-build 
an analysis sequence for each criterion dimension 
(or for each predictor for that matter). Rather,
what we seek is to develop a strategy which can be
used to relate a large set of predictors to each
of a large set of criteria. Much of what is
presented in the following chapters may be accu-
rately viewed as an attempt to develop such a
strategy. But before turning to the results of
these efforts, let us first examine briefly what
is meant by the term "change."

3 As we shall see later, even this is an over-
simplification because there are several competing
methods of deriving each of the six change scores.

FIGURE 1-2

SIX POSSIBLE INTERVALS FOR ASSESSING CHANGE

Wave 1: Fall '66
Wave 2: Spring '68
Wave 3: Spring '69
Wave 4: Summer '70

INTERVAL    TIME
1 to 2      18 mos.
1 to 3      30 mos.
1 to 4      44 mos.
2 to 3      12 mos.
2 to 4      26 mos.
3 to 4      14 mos.

What is Meant by "Change"? 

Because there has been much confusion in the 
past about what is meant by "change," let us begin
this chapter by attempting to clarify the author's 
use of this term. The most critical distinction 
arises when examining individual differences in 
change as opposed to average changes. As scientists, 
we are interested in what causes changes to occur 
in attitudes, values, aspirations, etc., even if 
some individuals change in one direction and others 
change in the opposite direction. In such a situ- 
ation, we may observe no average change, but it may 
be most interesting to try to discover what vari- 
ables are affecting these "equal and opposite" 
individual changes. 

There are many situations in which a study of 
individual change would be rather fruitless. For 
example, consider a situation in which a set of 
uniform procedures has been designed to bring
subjects to a specified terminal performance level. 
Now, suppose such procedures are administered to a 
group of subjects who are totally naive initially, 
and that those procedures are utilized until each 
subject attains the specified performance level. 
In such a situation, one would certainly observe
an average increase in performance; however, since
every subject will change an equal amount in the 
performance measure, one would not observe any 
individual differences in their change. 
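The distinction can be made concrete with a small sketch. The scores below are hypothetical and chosen only to contrast the two situations just described: "equal and opposite" individual changes with no average change, and a uniform change with no individual differences.

```python
from statistics import mean, pstdev


def summarize_change(initial, final):
    """Return (average change, spread of individual changes) for paired scores."""
    gains = [y - x for x, y in zip(initial, final)]
    return mean(gains), pstdev(gains)


# Hypothetical scores for four individuals.
initial = [10, 10, 10, 10]
offsetting_final = [13, 7, 12, 8]   # gains of +3, -3, +2, -2
uniform_final = [15, 15, 15, 15]    # every gain is +5

offsetting = summarize_change(initial, offsetting_final)
uniform = summarize_change(initial, uniform_final)
```

In the first case the average change is zero while the spread of individual changes is not; in the second the average change is five points and the spread is zero.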

In this chapter and in the following chapter, 
we will be focusing primary attention on situ- 
ations in which individual differences in changes 
are of interest. In Chapter 4, we will examine 
briefly a method for detecting whether average 
changes or trends are observable within subsets of 
our sample. Let us next turn our attention to 
three questions which guided the evolution of the 
proposed strategy. 

What Part Should "Change Scores" Play in the 
General-Purpose Strategy? 

An examination of this issue immediately
immerses the investigator in some ticklish
philosophical questions such as "What do we mean
by change?," "How do we measure change?," and
"Must we infer change rather than measure it?"⁴
From these philosophical issues come procedural-
statistical considerations involving the use of
"raw difference" scores vs. various kinds of
"adjusted gain" scores as measures of change.
Chapters 2 and 3 will present a summary of the
author's attempts to wrestle with these questions
and his recommendations vis-a-vis the use of
change scores in longitudinal analyses of the
Youth in Transition data.

4 See especially Harris, 1967; Coleman, 1964b;
Coleman, 1968; Cronbach & Furby, 1970.

A slightly different, but related, set of 
questions arises concerning the various ways in 
which change scores are put to use. For example, 
measures of change may be used to arrange indi- 
viduals along a continuum for subsequent analyses. 
This implies that one or more new dynamic dependent 
variables (i.e., change scores) are being derived 
from a combination of static (i.e., cross- 
sectional) scores (as in Figure 1-2). Alterna- 
tively, a single, overall score might be calculated 
by somehow combining two or more of these change 
scores to summarize the amount and direction of 
an individual's overall shift during the interval 
of interest for each criterion dimension. This 
use of a change score might best be viewed as a 
variable reduction procedure because it results 
in a single dependent measure for each dimension 
rather than the ten measures previously discussed. 
Still another potential use for change scores is 
to aid in the identification of criteria where a 
good deal of change is taking place. 

These three potential uses of change scores 
lead to somewhat different streams of analyses. 
However, they are interrelated to a substantial 
degree. For instance, if a method is found of 
deriving a change score which may be used as a 
dependent variable for subsequent analyses, then 
perhaps a single "overall change" score may be
found for each criterion. These overall change 
scores might be used, in turn, to point to focal 
areas for early analyses by showing "where the 
action is." However, if a method for calculating 
change scores is not found, then questions regard- 
ing the other two potential uses of change scores 
may not be as readily addressed. It thus follows 
that the question of what kind of change score 
should be used is a crucial one to be considered 
early in the development of a general-purpose
analytic strategy. 

What Alternatives Exist to "Overall Change" Scores? 

Since our attempt to develop a general-purpose 
strategy for longitudinal analyses leads us first 
to an examination of the role played by change 
scores, it is essential that alternatives to the 
use of such scores be carefully considered. The 
most promising alternative of this sort is what 
might be called "parallel prediction" of the 
static criterion scores. More specifically, this 
procedure calls for predicting from a selected 
set of important individual and environmental 
characteristics to the four static criterion 
measures. By noting whether or not a criterion 
is becoming "more predictable" across time, one 
may be able to infer that overall changes have 
taken place. And by noting which of the predictors
are assuming greater explanatory power, one starts
to get an indication of what kinds of variables
may be influencing the inferred changes.

The second monograph in the Youth in Transi-
tion series (Bachman, 1970) may be viewed as a
partial application of such a strategy. In sum- 
marizing the impact of the family background 
characteristics upon the initially measured 
criterion dimensions, a strategy was developed 
which first selected a limited number of impor- 
tant predictors from the much larger available 
set, and then used this limited set to predict 
separately to each of a set of selected criteria. 
Suppose we now performed parallel predictions from 
this same predictor set to the criterion dimen- 
sions measured at Times two, three, and four. If 
we observed in these analyses that the predict-
ability of a criterion remained fairly constant
over time, and if the relative importance of the 
predictors remained essentially the same, we would 
be inclined to conclude that this set of predictors 
has already established its pattern of influence 
on the criterion in question by the time of the 
initial measurement, and that this pattern is not 
changing during the period of observation. This 
conclusion does not follow necessarily; it is
simply the most obvious and parsimonious conclu- 
sion. On the other hand, if a criterion were to 
become more predictable over time, there would be 
ample reason to consider concentrating additional 
efforts on discovering the ways in which certain 
predictors increased in their explanatory power. 
By use of this kind of parallel prediction to
the several static criterion scores (Times one,
two, three, and four), it thus seems possible to
identify criterion measures where change is taking 
place and at the same time to note which variables 
might be accounting for this change. 
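A minimal sketch of the parallel prediction idea follows. It assumes a single predictor for simplicity (the strategy described here uses a selected set of predictors), and it takes the squared product-moment correlation as the measure of predictability. The data and the names `pearson_r` and `parallel_prediction` are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def parallel_prediction(predictor, criterion_by_wave):
    """Predict separately to each wave's static score; return r squared per wave."""
    return {
        wave: pearson_r(predictor, scores) ** 2
        for wave, scores in criterion_by_wave.items()
    }


# Hypothetical data: one predictor and one criterion measured at three waves.
predictor = [1, 2, 3, 4, 5, 6]
criterion_by_wave = {
    1: [3, 1, 4, 2, 5, 3],
    2: [2, 1, 4, 3, 5, 4],
    3: [1, 2, 3, 4, 5, 5],
}

explained = parallel_prediction(predictor, criterion_by_wave)
```

In this made-up example the explained variance rises from wave to wave, which is the pattern that would flag the criterion for closer study; a flat profile would suggest that the predictor's influence was already established at the initial measurement.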

How Can Data from More Than Two "Waves" Be Used 
Most Efficiently? 

The last of the three guiding questions used
in the attempts to develop a general purpose 
strategy arose as the author was reviewing previous 
work in this area. With very few exceptions, the 
overwhelming majority of previous efforts have 
been focused exclusively on problems of analyzing 
data from two-wave studies. It is not very sur- 
prising, therefore, that most of these previous 
efforts have limited utility for our four-wave
panel study. It seems intuitively obvious to the 
author that a general purpose strategy should 
make use of all available data. 

At the practical level, we have six different
intervals within which change could be examined.
Should we focus attention on just one of these
intervals? If so, which interval should be chosen?
If not, how can we develop a general-purpose
strategy that will apply to the 45 criteria which
have been measured at four points in time?

5 The most noteworthy exception is briefly
summarized in Appendix A (the previously referred
to Coleman chapter in Blalock and Blalock); but
as we shall see later, this noteworthy exception
contains other limiting features insofar as use
with our data is concerned.

To answer this question it will be helpful 
to present briefly one of the conclusions drawn 
from the analyses performed on the first three 
waves of data. While grappling with the question 
of what (if any) kind of change scores would best 
serve our purpose, it became obvious that the 
"parallel prediction" strategy outlined above
worked well for pointing to where change was 
indeed taking place. Specifically, when a cri- 
terion dimension became more (or less) predictable 
across the Time 1 to 3 interval, then we selected 
that criterion for subsequent analyses aimed at 
evaluating various kinds of change scores. It is 
important to realize that this procedure permits 
the selection of criterion dimensions without 
involving many of the rather messy methodological 
complications which enter the picture whenever
change scores of any type are used. 

The fact that this procedure involves pre- 
dicting to the repeated static scores indicates 
an additional advantage: these predictive rela-
tionships are often of considerable interest in
their own right, whether or not change scores are 
utilized. The predictions to the initial cri- 
terion scores from the boys’ family background 
characteristics and intelligence (Bachman, 1970) 
summarize the important family environment effects 
which were observed as the study began. Similarly,
predicting to the Time 3 criteria from school char-
acteristics (after appropriate "control" for indi-
vidual background factors) will permit summaries
of the effects of school environments. 

Early application of the parallel prediction
model thus seemed to make a good deal of sense.
It serves well the purpose of pointing to areas
where change may be taking place, and additionally
produces analyses of considerable interest in
their own right. Finally, and equally important,
it makes use of the criterion data from as many
waves as are available for analysis.

Summary 

The number of both predictor and criterion 
measures available for analyses is so large that 
a general-purpose strategy is a necessity. In 
developing such a strategy for longitudinal 
analyses, an early and pivotal decision regards 
the use of change scores. Three potential 
approaches have been outlined, and each will be 
considered in analyses reported in subsequent 
chapters. The utility of various kinds of change 
scores will be evaluated in the context of an 
analytic strategy calling for "parallel prediction"
to the repeated static scores. Such a strategy
seems to be advantageous both because it points 
to areas where change has taken place and also 
because it makes efficient use of all criterion 
data available. 



Chapter 2 

CHANGE SCORES: 
SOME STRENGTHS 
AND WEAKNESSES 



Before turning to an evaluation of the utility
of change scores in longitudinal analyses of the
Youth in Transition data in Chapter 3, it will be
helpful to examine the solutions proposed by pre-
vious investigators in the area of measurement and
analysis of change.¹ Although no single procedure
has emerged from these previous efforts, there are
some areas in which the authors seem to be essen- 
tially in agreement. Let us begin by examining 
these areas. 

Areas of Agreement 

Virtually all of the previous investigators
agree that raw change or raw gain scores² are of
questionable utility and can easily lead to fal-
lacious conclusions. One reason for this limited
utility derives from the commonly-observed negative
correlation between the initial static score and
the raw gain score. (Parenthetically, we might
note that in a parallel fashion, one would also
observe a positive correlation between the final
static score and the raw gain.) In the usual
case in which the variances of the initial and
final static scores are approximately equal, this
negative correlation will be observed regardless
of the sign of the relationship between the initial
and final scores. (See Appendix B for a proof of
this statement.) As a consequence of this negative
relationship, other variables which are positively
related to the initial score more than to the final
score are also likely to show negative relation-
ships with the raw gains. However, it is by no
means clear that these other variables effected a
"real" loss (negative change) in the criterion
across the observed interval.

1 See especially Harris, 1967; Coleman, 1968;
and Cronbach & Furby, 1970.

2 "Raw change" or "raw gain" is used to denote
a derived score formed by simple subtraction of an
earlier static score from the same measure obtained
at a later point in time.
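The sign pattern described above can be illustrated with a short sketch. It uses the standard expressions for the correlation of a difference score with each of its components, written in terms of the stability r_xy and the two standard deviations; it is an illustration under those textbook formulas, not a reproduction of the proof in Appendix B, and the function names are hypothetical.

```python
from math import sqrt


def corr_initial_with_raw_gain(r_xy, s_x=1.0, s_y=1.0):
    """Correlation between the initial score X and the raw gain Y - X.

    Undefined (division by zero) when the gain has no variance,
    e.g. r_xy == 1 with equal standard deviations.
    """
    gain_sd = sqrt(s_x**2 + s_y**2 - 2 * r_xy * s_x * s_y)
    return (r_xy * s_y - s_x) / gain_sd


def corr_final_with_raw_gain(r_xy, s_x=1.0, s_y=1.0):
    """Correlation between the final score Y and the raw gain Y - X."""
    gain_sd = sqrt(s_x**2 + s_y**2 - 2 * r_xy * s_x * s_y)
    return (s_y - r_xy * s_x) / gain_sd


# With equal variances the first correlation is negative and the second
# positive, whatever the sign of the stability coefficient.
signs = [
    (corr_initial_with_raw_gain(r), corr_final_with_raw_gain(r))
    for r in (-0.5, 0.0, 0.5, 0.9)
]
```

With equal variances the first expression reduces to minus the square root of (1 - r_xy) / 2, so a stability of .5 gives a correlation of -.5 between initial score and raw gain.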

One of the problems with raw gain scores
stems from the fact that they are systematically
related to whatever amount of random measurement
error is contained in the static scores. There
again seems to be general agreement that issues
of measurement error, although potentially impor-
tant in all studies, are of critical importance
in this area of measurement and analysis of change.
Perhaps an example will help to illustrate the
scope of this problem.³ In Table 2-1, X represents
the initial static score and Y represents the final
score of the same measure. Next let $r_X$ and $r_Y$ be
used to designate the internal consistencies
(reliabilities) of the initial and final scores
respectively, and let $r_{XY}$ represent the stability
across the interval calculated via the product-
moment correlation coefficient between the initial
and final scores. Of interest to us here is how
various combinations of internal consistency and
stability affect the internal consistency (relia-
bility) of the derived raw gain score, represented
by $r_{Y-X}$. The figures are based on the generally-
observed fact that both the internal consistency
and the variance of the static scores are constant
across time.

3 This example is based on a discussion by
Bereiter, 1967, p. 10.



TABLE 2-1

RELIABILITY OF RAW GAIN SCORES:
AN ILLUSTRATION

                                                Internal Consistency
Case    Internal Consistency    Stability       of Raw Gain Score

I       $r_X = r_Y = .8$        $r_{XY} = .7$   $r_{Y-X} = .33$
II      $r_X = r_Y = .9$        $r_{XY} = .7$   $r_{Y-X} = .53$
III     $r_X = r_Y = .8$        $r_{XY} = .0$   $r_{Y-X} = .80$
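The kind of calculation behind such an illustration can be sketched as follows. The function uses the standard classical-test-theory expression for the reliability of a difference score, assuming measurement errors that are uncorrelated with each other and with the true scores; whether the table was computed in exactly this way is an assumption, and the name `gain_reliability` is hypothetical.

```python
def gain_reliability(r_x, r_y, r_xy, s_x=1.0, s_y=1.0):
    """Reliability of the raw gain Y - X under classical test theory.

    r_x, r_y : internal consistencies of the initial and final scores
    r_xy     : stability (correlation between initial and final scores)
    s_x, s_y : standard deviations of the initial and final scores
    """
    true_gain_var = r_x * s_x**2 + r_y * s_y**2 - 2 * r_xy * s_x * s_y
    observed_gain_var = s_x**2 + s_y**2 - 2 * r_xy * s_x * s_y
    return true_gain_var / observed_gain_var


# Internal consistencies of .8 at both times, equal variances:
moderate_stability = gain_reliability(0.8, 0.8, 0.7)  # about .33
zero_stability = gain_reliability(0.8, 0.8, 0.0)      # .80
```

Holding internal consistency fixed, lowering the stability raises the reliability of the raw gain, which is the pattern the following paragraph discusses.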
Inspection of this table reveals a seemingly
incomprehensible situation; namely, if one is 
interested in gain scores which have high relia- 
bility, he should apparently seek measures which 
have high internal consistency but low stability 
across time! But if he used such measures, how 
would he know whether or not he has measured the 
same thing at the two points in time? Bereiter's 
answer to this apparent paradox is one to which 
other authors seem to agree: 

Where it becomes crucial to decide 
whether or not one is measuring change is 
in the selection or construction of the 
measuring instruments. If one is measur- 
ing change, then it is as measures of 
change and only as measures of change 
that the validity and reliability of his 
instruments have any importance (Bereiter, 
1967, p. 14). 

Thus, the effect of measurement error in 
static scores from which raw change scores are 
derived is to decrease the reliability of the 
resulting change score. One should seek measures 
which are as reliable (i.e., internally consistent) 
as possible but which are not so stable across 
time that no change may be observed. This very 
important distinction between "split-half" vs.
"test-retest" reliability has not received as
much attention as it deserves. (See Heise, 1969,
for an example of a thorough treatment of this
distinction.)

In the next chapter, we will examine in 
considerable detail the internal consistency and 
stability coefficients for a set of criterion
dimensions from the Youth in Transition study. 
Since these dimensions were selected before the 
above-mentioned suggestions were available, it 
will be of considerable interest to note whether 
the observed reliabilities and stabilities possess 
the previously-described characteristics to a 
degree sufficient to warrant using raw change 
scores for analyzing at least some of the dimen- 
sions. In this light it is of interest to note 
that Shaycoft concluded that raw gain scores 
based on Project TALENT's measures of aptitude
and ability had reliabilities so low as to render
them analytically useless (Shaycoft, 1967, pp.
4-19 through 4-30).

How to Improve on Raw Gain Scores 

So far in this chapter we have noted areas of 
apparent consensus regarding some limitations in 
the use of raw gain scores. Unfortunately, there 
are almost as many solutions to the problems posed 
by these limitations as there are authors who have 
investigated the issues. A significant portion of 
the lack of agreement among the proposed solutions 
arises because of the many and varied potential 
purposes for gain scores. A partial list of such 
purposes would include the following: 

- to provide a dependent variable for 
subsequent analyses 

- to select exceptional individuals 
for additional study 
- to obtain an indicator for a concept
or construct in such a way that its
relationship with other variables
conforms most closely to a given
theory

- to estimate change for an individual
with respect to a group

- to examine mean changes for groups

Although the ingredients of this list are not
intended to be mutually exclusive, they perhaps
serve to illustrate the importance of keeping in
mind the purpose(s) which gain scores will serve
once they have been derived.⁴

On the Youth in Transition study, the primary 
interest in deriving change scores relates to 
their use as dependent variables in subsequent 
analyses. Therefore, let us examine a few of the 
previously suggested procedures for improving on 
raw gain scores as dependent variables. 

4 The reader may recall from the first chapter
that at least three different, but related, poten-
tial uses of change scores are being considered in
the development of this general-purpose strategy.
(See pp. 7-8.)

The Lord Procedure (see Lord, 1967, pp. 21-38).
The Lord procedure considers an "initial"
score X and a "final" score Y applied to each of
a sample of subjects on two occasions. True scores
$X_T$ and $Y_T$ for each individual at these times are
postulated, following directly from similar formu-
lation in classical test theory. The procedure
is aimed at estimating a true difference or gain
score $G_T = Y_T - X_T$ for each individual via the deter-
mination of regression coefficients for an equation
of the form:⁵

$$\hat{G}_T = \bar{G} + \beta_{GX \cdot Y}(X - \bar{X}) + \beta_{GY \cdot X}(Y - \bar{Y}) \tag{1}$$

Lord shows that, for a sufficiently large number
of cases, the average true gain, $\bar{G}$, may be esti-
mated by the arithmetic difference between the
observed means; namely $\bar{G} = \bar{Y} - \bar{X}$. The estimated
true gain (Formula 1) then follows from assuming
that the variances of the initial and final scores
are equal (as is typically the case) and that the
errors are uncorrelated with the true scores. The
procedure then reduces to one of estimating the
two partial regression coefficients as follows:



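$$\beta_{GX \cdot Y} = \frac{(1 - r_Y)\, r_{XY}\, \dfrac{s_Y}{s_X} - r_X + r_{XY}^2}{1 - r_{XY}^2} \tag{2}$$

and

$$\beta_{GY \cdot X} = \frac{r_Y - r_{XY}^2 - (1 - r_X)\, r_{XY}\, \dfrac{s_X}{s_Y}}{1 - r_{XY}^2} \tag{3}$$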



As before, $r_X$ and $r_Y$ represent the reliabilities
of the initial and final scores respectively.
These reliability coefficients are first estimated
by an independent procedure and then, along with
the observed stabilities and standard deviations,
are substituted in equations 2 and 3 to solve for
the regression coefficients. These coefficients
are in turn substituted into equation 1 (along
with the difference between the observed means) in
order to solve for an individual's true gain score.

5 In these and following discussions, a bar
above a term designates the average value across
individuals and $\beta$ will be used to represent
standard partial regression coefficients via the
usual subscript notation showing the correlated
residualized variables before the dot and the
variable used in obtaining the residuals after the
dot.
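The substitution steps just described can be sketched in a few lines. The coefficient expressions below follow the usual statement of Lord's two partial regression coefficients (equations 2 and 3), with the reliabilities, stability, and standard deviations supplied by the analyst; the function names and the numerical inputs are hypothetical.

```python
def lord_coefficients(r_x, r_y, r_xy, s_x, s_y):
    """Partial regression coefficients for estimating true gain (equations 2 and 3).

    r_x, r_y : reliabilities of the initial and final scores
    r_xy     : observed stability (must not equal +1 or -1)
    s_x, s_y : observed standard deviations of the initial and final scores
    """
    denom = 1 - r_xy**2
    b_gx_y = ((1 - r_y) * r_xy * s_y / s_x - r_x + r_xy**2) / denom
    b_gy_x = (r_y - r_xy**2 - (1 - r_x) * r_xy * s_x / s_y) / denom
    return b_gx_y, b_gy_x


def lord_true_gain(x, y, mean_x, mean_y, r_x, r_y, r_xy, s_x, s_y):
    """Estimated true gain for one individual (equation 1)."""
    b_gx_y, b_gy_x = lord_coefficients(r_x, r_y, r_xy, s_x, s_y)
    mean_gain = mean_y - mean_x  # estimate of the average true gain
    return mean_gain + b_gx_y * (x - mean_x) + b_gy_x * (y - mean_y)


# Hypothetical individual and group statistics.
estimate = lord_true_gain(
    x=10, y=14, mean_x=12, mean_y=13,
    r_x=0.8, r_y=0.8, r_xy=0.7, s_x=2.0, s_y=2.0,
)

# Sanity check: with perfectly reliable scores the estimate is the raw gain y - x.
perfect = lord_true_gain(10, 14, 12, 13, 1.0, 1.0, 0.6, 2.0, 2.0)
```

With perfectly reliable measures the coefficients become -1 and +1 and the estimate collapses to the raw gain; with unreliable measures the estimate is pulled toward the group's average gain.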

In summary, Lord's procedure permits the
estimation of an individual's true gain, using the
individual's observed initial and final scores and
statistics based on the total set of observed
individuals.⁶
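The steps of the Lord procedure can be sketched in Python. This is only an illustration under stated assumptions: the function names and the six pairs of scores are hypothetical, the reliabilities are taken as already estimated by an independent procedure, and the stability is computed as the ordinary Pearson correlation between the observed initial and final scores.

```python
from statistics import mean, stdev


def pearson(a, b):
    """Pearson correlation between two equal-length score lists."""
    ma, mb = mean(a), mean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    ssa = sum((u - ma) ** 2 for u in a)
    ssb = sum((v - mb) ** 2 for v in b)
    return cov / (ssa * ssb) ** 0.5


def lord_coefficients(r_x, r_y, r_xy, s_x, s_y):
    """Partial regression weights of equations 2 and 3."""
    denom = 1 - r_xy ** 2
    b_gx = ((1 - r_y) * r_xy * s_y / s_x - r_x + r_xy ** 2) / denom
    b_gy = (r_y - r_xy ** 2 - (1 - r_x) * r_xy * s_x / s_y) / denom
    return b_gx, b_gy


def lord_true_gain(x, y, r_x, r_y):
    """Estimated true gain for each individual (equation 1)."""
    mx, my = mean(x), mean(y)
    b_gx, b_gy = lord_coefficients(r_x, r_y, pearson(x, y), stdev(x), stdev(y))
    g_bar = my - mx  # mean true gain, estimated by the difference of observed means
    return [g_bar + b_gx * (xi - mx) + b_gy * (yi - my) for xi, yi in zip(x, y)]


# Hypothetical initial and final scores for six individuals.
initial = [10, 12, 9, 15, 11, 14]
final = [11, 15, 9, 14, 13, 16]
gains = lord_true_gain(initial, final, r_x=0.7, r_y=0.7)
```

When both reliabilities are set to 1, the two weights reduce to -1 and +1 and the estimate collapses to the raw gain Y - X, which is a convenient check on any implementation.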

The Bereiter Procedure (see Bereiter, pp. 3-20).
Whereas the major objective of the Lord
procedure was to derive an individual measure of
true gain so that this measure can then serve as
a new dependent variable to be predicted from
other characteristics of interest, the Bereiter
procedure is aimed at estimating such relationships
(i.e., between predictors and true gain
scores) directly. Specifically, to obtain the
correlation between an independent (predictor)
variable W and the final static score Y in a way
that "controls for" the individual's score on the
initial dependent variable X, he suggests using
the following formula:

$$r_{WY \cdot X_T} = \frac{r_X r_{WY} - r_{WX} r_{XY}}{\sqrt{r_X - r_{WX}^2}\,\sqrt{r_X - r_{XY}^2}} \quad (4)$$

6 For a critical review of the underlying
assumptions, see Cronbach & Furby, 1970, pp.
69-70.

Note that it is $X_T$, the initial true score, which
is being partialled out of the WY relationship in 
the formula. This indicates the importance to 
Bereiter of correcting for the unreliability in 
the initial score. Because the sign of the 
denominator in Formula 4 will always be positive, 
it is the sign of the numerator which determines 
whether the estimated relationship will be positive 
or negative. 

Bereiter shows that if the initial raw score
(rather than the true score as in Formula 4) is
partialled out, the numerator of the estimated
relationship is $r_{WY} - r_{WX} r_{XY}$. Thus, taking into
account the initial score reliability could
actually reverse the sign of the relationship
based on raw score calculations. Therefore,
whether or not one corrects for this unreliability
has potentially important implications. This
decision is not an empirical one; rather, it
follows from the analyst's decision as to whether
he seeks a set of change scores orthogonal to the
initial observed scores or to the estimated initial
true scores.

Bereiter next addresses the question as to
whether or not a similar correction for unreliability
in the final Y scores should be made. It
would seem consistent with our objective (i.e., to
estimate the relationship between the independent
variable W and $Y_T - X_T$, the true gain) to make such
a correction. Bereiter's argument is that this
adjustment is not critical. This follows from the
fact that the adjustment takes the form of replacing
$r_X$ with the product $r_X r_Y$ in the second factor
of the denominator of Formula 4. Whenever $r_Y$ is
less than one (and it can never exceed 1) the
effect of this change is to produce a smaller
denominator which, in turn, results in a larger
absolute value for the estimated relationship.
Unlike the correction for initial score unreliability,
however, the direction of the two estimated
relationships ($r_{WY_T \cdot X_T}$ and $r_{WY \cdot X_T}$) will always be
the same; one will simply observe that $r_{WY_T \cdot X_T}$
will usually be larger than will $r_{WY \cdot X_T}$.
From this Bereiter concludes that the choice
between these two coefficients is not critical.
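The two corrections can be compared numerically. The Python sketch below evaluates Formula 4, its variant with $r_X r_Y$ in the second denominator factor, and the raw-score partial correlation. The function names and the five coefficient values are hypothetical, chosen only so that the sign reversal appears, and the raw-score denominator is assumed to be the usual partial-correlation form.

```python
from math import sqrt


def partial_r_raw(r_wy, r_wx, r_xy):
    """r_WY.X with the observed initial score partialled out."""
    return (r_wy - r_wx * r_xy) / sqrt((1 - r_wx ** 2) * (1 - r_xy ** 2))


def partial_r_true_initial(r_wy, r_wx, r_xy, r_x, r_y=None):
    """Formula 4: the initial true score is partialled out.

    If r_y is supplied, the final score is also corrected for
    unreliability: r_x is replaced by r_x * r_y in the second
    factor of the denominator.
    """
    second = r_x if r_y is None else r_x * r_y
    numerator = r_x * r_wy - r_wx * r_xy
    return numerator / sqrt((r_x - r_wx ** 2) * (second - r_xy ** 2))


# Hypothetical values (assumptions for illustration only).
r_wy, r_wx, r_xy, r_x, r_y = 0.30, 0.50, 0.50, 0.70, 0.70

raw = partial_r_raw(r_wy, r_wx, r_xy)                               # positive
true_x = partial_r_true_initial(r_wy, r_wx, r_xy, r_x)              # negative
true_xy = partial_r_true_initial(r_wy, r_wx, r_xy, r_x, r_y)        # negative, larger in size
```

With these inputs the raw-score coefficient is positive while both corrected coefficients are negative, and adding the final-score correction changes only the magnitude, not the direction.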



In short, then, the Bereiter procedure may
be used to estimate directly the relationship
between a predictor variable, W, and true gain
score, $Y_T - X_T$, without ever estimating the true
gain score itself. As in the Lord formulae,
initial and final observed scores at both individual
and average levels, as well as observed
interrelationships among these scores, are used
in obtaining the estimated relationship. His
derivations suggest that the practical effect of
adjusting these estimates for final score
unreliability is relatively unimportant compared
to the adjustment for initial score unreliability.

The Cronbach-Furby Procedure (see Cronbach & 
Furby, 1970). Like the Bereiter procedure, Cronbach 
and Furby suggest a means of estimating relation- 
ships between predictor variables and estimated 
true gains without actually calculating the true 
gain scores. In fact, they conclude that 

...gain scores are rarely useful, 
no matter how they may be adjusted or 
refined (Cronbach & Furby, 1970, p. 68). 

In spite of this stand, they propose a procedure 
for estimating true gains, both because they feel 
it provides a better estimator to use in the 
limited cases where they recommend using change 
scores, and because 

Very likely some investigators will 
decide to obtain change or difference 
scores, even for problems where we consider 
such measures inappropriate. Such a person 
will often find one of our estimation 
formulas better than those now suggested 
in the literature (Cronbach & Furby, 1970,
p. 68).

The Cronbach-Furby discussion presents an 
important extension of the Bereiter model in that 
(a) several W (predictor) variables may be examined 
simultaneously (as would be the case, for example, 
in multiple linear regression models), (b) their
proposed model is appropriate for analyses of 
differences between two variables in a cross- 
sectional study as well as for longitudinal 
analyses, and (c) it introduces a new class of Z
variables; namely, those variables measured at the 
time of (or after) the final measure, but which 
might be used to further refine the estimate of 
true gain. 

With respect to the Youth in Transition 
project, extensions b and c above offer little if 
any help. However, the ability to handle several 
predictor variables at once represents a potentially
important addition to previously available
procedures.

In addition, the authors demonstrate that if
one's objective is to identify individuals who
have gained (or lost) an exceptional amount, then
the individual's true residual gain need not be
calculated. Rather, the "raw residual-gain
score," $D \cdot X = Y - \bar{Y} - \beta_{Y \cdot X}(X - \bar{X})$,⁷ is well suited for
such a purpose. (Notice that this score is not
equivalent to the raw gain score.) However, if
the individual's true residual gain is to be
estimated, the authors provide a different method
for estimating it.
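A raw residual-gain score of this form can be computed directly from the observed scores. The following Python sketch assumes that the coefficient is the ordinary least-squares slope of the final score on the initial score; the function name and the data are hypothetical.

```python
from statistics import mean


def raw_residual_gain(x, y):
    """Return (Y - mean Y) - b_yx * (X - mean X) for each individual."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b_yx = sxy / sxx  # least-squares slope of final score on initial score
    return [(yi - my) - b_yx * (xi - mx) for xi, yi in zip(x, y)]


# Hypothetical initial and final scores for six individuals.
initial = [10, 12, 9, 15, 11, 14]
final = [11, 15, 9, 14, 13, 16]
residual_gains = raw_residual_gain(initial, final)

# Individuals can then be ranked to find exceptional gainers or losers.
ranked = sorted(range(len(initial)), key=lambda i: residual_gains[i], reverse=True)
```

By construction these scores average zero and are uncorrelated with the observed initial score, which is what distinguishes them from the raw gain Y - X.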

Thus, the Cronbach-Furby paper presents an
extension of the previously discussed Bereiter
procedure via a procedure to be used for estimating
relationships between true gain scores and a set
of predictor variables. In addition, the authors
suggest the use of raw residual-gain scores
when the objective is to identify exceptional
gainers or losers within a group of individuals.

7 Formula 21 (Cronbach & Furby, 1970, p. 74)
appears to be correct in this regard. A "correction"
(Errata, 1970, p. 218) is in error.

Summary 

All three of the procedures discussed above 
(Lord, Bereiter, and Cronbach-Furby) deal with 
methods of improving on raw gain scores as 
dependent variables. This improvement was seen 
to be necessary because of the regularly observed 
negative correlation between raw gains and initial 
scores, the positive correlation between raw gains
and final state scores, and also because of the 
confounding of random measurement error with the 
raw gain scores. The three procedures outlined 
do not all address themselves to the same objec- 
tives. This illustrates the need for the analyst 
to identify carefully the purpose(s) of the gain
score he seeks before choosing a procedure for 
producing such a score. 

It is worth noting again here that none of
the procedures discussed make efficient use of
data from more than two points in time; thus,
they all fail to meet one of the objectives put
forth in the first chapter as a desired condition
for our general-purpose strategy. Coupled with
the desirability of predicting the static criterion
scores in order to achieve analytic objectives
of considerable importance in their own
right,⁸ the failure of gain scores to meet this
objective casts even further doubt upon their
general utility in our longitudinal analyses
model.

8 See Bachman, 1970, for an example of a
report based on analyses of this sort.
Chapter 3

STABILITY VS. CHANGE
IN THE YOUTH IN
TRANSITION DATA



This chapter is devoted to an evaluation of 
change scores for longitudinal analyses of the 
Youth in Transition data. As an integral part of 
this evaluation, adjusted gain scores are compared 
with raw change scores, with an eye to noting 
whether the advantages of adjusted gains described 
in the previous chapter actually are observable in 
the available data. 

Stability in the Criteria 

Table 3-1 presents the means and standard
deviations of 18 criterion dimensions¹ measured
at each of the first three data collections. (For
reasons described in Chapter 1, these analyses are
based on the 1,374 boys who stayed in the same
school throughout their sophomore to senior years.)

The data in Table 3-1 allow us to investigate
whether or not the school environment is bringing
about consistent changes in our criterion dimensions.
If the school were exerting an important
and consistent influence on these criteria, we
might expect to find the means in Table 3-1 moving
in the same direction across time. If the school
exerted a facilitating effect, the Time 3 mean
would be expected to be larger than the Time 2
mean which, in turn, should be larger than the
Time 1 mean. If the school exerted a debilitating
effect, just the opposite pattern should be
observed; that is, we would expect the means to
drop across time.

1 See Bachman, et al., 1967, Chapter 4 for a
description of the composition of these measures.

TABLE 3-1

TIME 1, 2, AND 3
MEANS AND STANDARD DEVIATIONS
OF 18 SELECTED CRITERIA

A second type of school effect may be observed 
by examining the standard deviations in Table 3-1. 

If schools were causing students to become more 
alike, then the scores would tend to converge more 
as time passed. This convergence would be indicated 
by a “shrinkage” in the standard deviations at 
subsequent data collections. On the other hand, 
if schools were encouraging students to become 
less alike (as might be the case in schools which 
truly developed the student's individuality, for 
example), then we would expect to find that the
standard deviations are increasing across time. 

This second type of school effect bears directly 
on whether schools are acting as a "conforming" 
or a “non-conforming” agent. 

When the data in Table 3-1 are examined, the 
overall picture which emerges is one of consider- 
able stability, both in the means and in the 
standard deviations. We shall see later in this 
chapter that this stability exerts an important 
effect on the potential use of change scores. 
A few of the dimensions in Table 3-1 show
evidence of systematic shifts in the means.²
However, none of these shifts appears to be very
large. As a matter of fact, Job Information is
the only dimension to evidence a shift in which
the mean gain equals or exceeds one-quarter of a
standard deviation across both intervals. And
none of the dimensions demonstrate large convergence
or divergence in their scores. At this
point, then, we have seen little evidence of
either type of school effect in these criterion
dimensions.

2 Job Information, Self-Esteem, Ambitious Job
Attitudes, Internal Control, and Trust in the
Government show increases across both intervals,
and Positive School Attitudes shows a consistent
decrease.

Another way to examine the stability of
measures repeated at two or more points in time
is through the use of correlation coefficients.
Instead of asking whether or not there are shifts
in the means and/or variances, this second kind
of investigation focuses on whether individuals
change across time relative to the other individuals
in the sample. As more and more individuals
hold the same relative position through
time, the correlation between the scores at the
beginning and end of an observed interval will
approach unity.

In examining such "stability coefficients,"
we cannot ignore the potential effects of
measurement error. Operationally, the issue may
be thought of as trying to distinguish whether the
lack of perfect stability is due to lack of perfect
reliability in the measuring instrument or whether
"real" shifts have taken place. Having data from
three points in time helps to make this distinction.
In a panel study of political attitudes
during the 1956, '58, and '60 elections (Converse,
1963), Converse discusses some of the possible
implications of the stability of his data from
three waves.

The most revealing statistical property
of these attitude-change data emerges when we
consider not simply the correlations between
the same attitudes over two-year spans, but
also the correlation for each attitude between
the initial and terminal interviews, a span
of four years. For we discover that these
$t_1$-to-$t_3$ correlations tend to be just about
the same magnitude as the $t_1$-to-$t_2$ correlations,
or the $t_2$-to-$t_3$ correlations. That
is, surprising though it may be, one could
predict the 1960 attitudes on most of these
issue items fully as well with a knowledge
of individual attitudes in 1956 alone as one
could with a knowledge of the more proximal
1958 responses. Furthermore, the tendency
toward parity of the three correlations is
clearest among the issue items with greatest
turnover; among the more nearly stable items,
the four-year correlation tends to be slightly
lower than the two-year correlations, a
pattern which is of course much closer to
our intuitive expectations (Converse, 1963,
pp. 7-8).

Let us consider a model in which an observed
score is comprised of a true score plus a random
measurement error component. Let us further assume
that in the observed population the error components
are distributed normally around a zero mean,
are uncorrelated with the true score, and are
uncorrelated across time. Now if the true scores
were perfectly stable, the observed correlations
would vary from unity because the error component
varies. However, since the error components are
serially uncorrelated, the observed score correlations
would not be expected to vary with the
length of the interval. Such a model could account
nicely for the Converse data.

On the other hand, if real change in relative 
position were taking place (i.e., if the true
scores were not perfectly stable), then we would
expect to observe lower stabilities for longer 
intervals and higher stabilities for shorter 
intervals. This is due to the fact that when 
real change is occurring, the longer the interval, 
the more reordering or changing we would expect. 
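The contrast between the two models can be illustrated with a small simulation. The Python sketch below is hypothetical throughout: the sample size, the error and change standard deviations, and the random-walk form of "real change" are assumptions made only for illustration, and the three waves are treated as equally spaced.

```python
import random


def pearson(a, b):
    """Pearson correlation between two equal-length score lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    ssa = sum((u - ma) ** 2 for u in a)
    ssb = sum((v - mb) ** 2 for v in b)
    return cov / (ssa * ssb) ** 0.5


def simulate(n, change_sd, error_sd, seed=0):
    """Return (r12, r23, r13) for three waves of observed scores."""
    rng = random.Random(seed)
    waves = ([], [], [])
    for _ in range(n):
        true = rng.gauss(0, 1)
        for k in range(3):
            if k > 0 and change_sd > 0:
                true += rng.gauss(0, change_sd)  # real change between waves
            # Observed score = true score + serially uncorrelated error.
            waves[k].append(true + rng.gauss(0, error_sd))
    x1, x2, x3 = waves
    return pearson(x1, x2), pearson(x2, x3), pearson(x1, x3)


# Perfectly stable true scores, unreliable measurement.
stable = simulate(10000, change_sd=0.0, error_sd=1.0)

# True scores that drift between waves.
changing = simulate(10000, change_sd=0.7, error_sd=0.5)
```

Under the first setting the three correlations should be nearly equal regardless of interval length; under the second, the wave 1 to wave 3 correlation should fall below both adjacent-wave correlations.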

Table 3-2 presents the stability coefficients
for the 18 dimensions contained in Table 3-1.
Again, the overall picture is one of considerable
stability. However, the data in this table are
consistent with the idea that some real changes
in our criteria may be taking place during the
high school years. This follows from the observation
that the highest stability coefficients are
found in the column corresponding to the shortest
interval (Time 2 to 3), the lowest coefficients
are located in the column corresponding to the
longest interval (Time 1 to 3), with the third set
of stabilities (corresponding to the Time 1 to 2
interval) falling between these two extremes.

TABLE 3-2

TIME 1-3, 1-2, AND 2-3 CORRELATIONS
FOR 18 SELECTED CRITERIA

                                 Time 1-3   Time 1-2   Time 2-3
                                 (30 mos)   (18 mos)   (12 mos)

Job Information Test               .53        .59        .61
Positive School Attitudes          .42        .49        .57
Negative School Attitudes          .41        .47        .54
Need for Self-Development          .50        .56        .64
Need for Self-Utilization          .42        .48        .54
Self-Esteem                        .49        .54        .66
Negative Affective States          .52        .56        .70
Happiness                          .47        .54        .63
Somatic Symptoms                   .42        .52        .62
Social Values                      .41        .51        .54
Ambitious Job Attitudes            .36        .46        .52
Internal Control                   .32        .42        .51
Trust in People                    .35        .37        .47
Trust in the Government            .33        .46        .48
Delinquent Behaviors               .48        .53        .63
Academic Achievement (Grades)      .58        .67        .66
College Plans                      .40        .44        .49
Occupational Aspirations           .53        .62        .66
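The ordering described for Table 3-2 can be checked row by row. The Python sketch below transcribes the tabled coefficients; the dictionary and function names are hypothetical.

```python
# (Time 1-3, Time 1-2, Time 2-3) stability coefficients from Table 3-2.
TABLE_3_2 = {
    "Job Information Test": (.53, .59, .61),
    "Positive School Attitudes": (.42, .49, .57),
    "Negative School Attitudes": (.41, .47, .54),
    "Need for Self-Development": (.50, .56, .64),
    "Need for Self-Utilization": (.42, .48, .54),
    "Self-Esteem": (.49, .54, .66),
    "Negative Affective States": (.52, .56, .70),
    "Happiness": (.47, .54, .63),
    "Somatic Symptoms": (.42, .52, .62),
    "Social Values": (.41, .51, .54),
    "Ambitious Job Attitudes": (.36, .46, .52),
    "Internal Control": (.32, .42, .51),
    "Trust in People": (.35, .37, .47),
    "Trust in the Government": (.33, .46, .48),
    "Delinquent Behaviors": (.48, .53, .63),
    "Academic Achievement (Grades)": (.58, .67, .66),
    "College Plans": (.40, .44, .49),
    "Occupational Aspirations": (.53, .62, .66),
}


def follows_interval_ordering(r13, r12, r23):
    """True when the longest interval is least stable and the shortest most stable."""
    return r13 < r12 < r23


exceptions = [name for name, rs in TABLE_3_2.items()
              if not follows_interval_ordering(*rs)]
```

On the tabled values, the Time 1-3 coefficient is the lowest of the three for every dimension. The only departure from the strict ordering is Academic Achievement (Grades), whose Time 1-2 coefficient (.67) slightly exceeds its Time 2-3 coefficient (.66).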

In the second chapter we noted that both the 
stability and the internal consistency of a measure 
exert important influences on the reliability of
the raw change score. We have just seen that our 
criterion dimensions are relatively stable during 
the periods where we have observed them. Let us 
now proceed in our evaluation of change scores by 
examining the internal consistencies of a selected 
set of criteria. 

Internal Consistencies of the Static Scores 

The internal consistency coefficients are 
important indicators of the quality of the cri- 
terion data. In addition to yielding information 
about the reliabilities of our measures, such 
coefficients may greatly aid our attempts to 
detect and to understand changes. In Table 2-1 
we saw that if a change score is to be optimally 
useful, it should be based on static scores whose 
reliability is as high as possible. Ideally, we 
seek measures whose reliability remains approximately
constant across time so that the same
amount of measurement error exists in the final 
scores as existed in the initial scores, thus 
increasing the potential utility of change scores 
derived from these static scores. 
The internal consistency estimates reported
here are Cronbach-Alpha coefficients (Cronbach,
1951, pp. 297-354). This reliability estimate
was chosen because it is well suited to our rating-
scale type of data. Additionally, it may be
interpreted directly as the proportion of true
score variance accounted for by the observed
score (Nunnally, 1967, p. 196), a characteristic
which makes it especially useful for our present
purposes. Finally, but also very important, an
efficient procedure for estimating the coefficient
from item responses is available within our own
software system.³
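For readers without access to that software, coefficient alpha can be computed from item responses with the standard formula, k/(k-1) times one minus the ratio of summed item variances to total-score variance. The Python sketch below is a minimal illustration and not the project's program; the function name and the layout of the input are assumptions.

```python
from statistics import pvariance


def cronbach_alpha(items):
    """Coefficient alpha.

    items: list of per-item score lists, each holding one score
    per respondent in the same respondent order.
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # scale score per respondent
    item_variance = sum(pvariance(col) for col in items)
    return k / (k - 1) * (1 - item_variance / pvariance(totals))


# Hypothetical responses of four respondents to three items.
responses = [
    [1, 2, 3, 4],
    [2, 2, 4, 5],
    [1, 3, 3, 5],
]
alpha = cronbach_alpha(responses)
```

When every item is scored 0 or 1 and population variances are used, as here, the same expression is the Kuder-Richardson Formula 20.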

Table 3-3 presents the reliability estimates
for the six dimensions identified in Tables 3-1
and 3-2 as potentially reflecting the greatest
systematic shifts from waves 1 to 3. It is
immediately apparent that, except for Internal
Control, no large shifts are found in the internal
consistencies from Time 1 to Time 3. Equally
observable is the fact that these measures have
reasonably good reliabilities, an average of 71
percent of the true score variance being accounted
for by the observed scores for the six dimensions
in this table.⁴

3 This program is based on a paper by
Bohrnstedt, 1969, pp. 542-548.

4 Because of the lack of "balance" (i.e., all
15 items are reversed) in the Positive School
Attitudes scale, response bias is very likely leading
to an overestimate of the internal consistencies.
TABLE 3-3

RELIABILITY ESTIMATES (CRONBACH α)
FOR SIX SELECTED CRITERIA

                                                   # Choices in
Dimension                   # Items  # Reversals  Response Scale    T1       T3

Job Information Test          25         --           2-5*         .699**   .671**
Self-Esteem                   10          6            5           .737     .785
Positive School Attitudes     15         15            4           .909     .912
Ambitious Job Attitudes       13          6            4           .637     .640
Internal Control              12          5            2           .554     .675
Trust in the Govt.             3          1            5           .637     .635

*Recoded to "right-wrong" versions for estimate.

**Item responses for this test were scored for
correctness, and the recoded answers were then
input to the Kuder-Richardson Formula 20 reliability
estimate. (See Nunnally, 1967, pp. 196-197.)
With respect to the criterion dimensions
examined to this point, we find evidence of a good 
deal of stability and of internal consistency. 

Let us now turn our attention to the implications 
of these empirical observations for the use of 
change scores. 

Effects of Static Score Stability and Internal 
Consistency on Change Scores 

Earlier we saw that when the stability of a 
static score was fairly high, the internal 
consistency of the raw change score derived from 
the static scores would be low. Then we saw that 
most of our criterion dimensions had relatively 
high stability coefficients. The primary purpose 
of this section is to compare the reliabilities 
of adjusted gain scores to those of raw change 
scores. There is a very pragmatic objective
which urges that this comparison be made; namely,
if the theoretical advantages which previous
authors suggest should accompany the adjusted gain
scores are not observable in our data, then we see
little reason to go to the considerable expense of
performing the adjustment. If the advantages of
adjusting are obtained, however, then we will want
to proceed into the next stage of analysis with
such adjusted gain scores, and not with raw difference
scores.

Following are formulae for the reliability
coefficients of raw difference, independent gain,
and regressed gain scores.⁵ Here X and Y will be
used to represent the initial score and the final
score respectively, $r_X$ and $r_Y$ the internal
consistency coefficients, and $r_{XY}$ the stability
coefficient across the interval of observation.

$$\text{Reliability of Raw Difference} = \frac{r_Y s_Y^2 - 2 r_{XY} s_X s_Y + r_X s_X^2}{s_Y^2 - 2 r_{XY} s_X s_Y + s_X^2} \quad (5)$$

$$\text{Reliability of Independent Gain} = \frac{r_X (r_X r_Y - r_{XY}^2)}{r_X^2 - 2 r_X r_{XY}^2 + r_{XY}^2} \quad (6)$$

$$\text{Reliability of Regressed Gain} = \frac{r_Y - 2 r_{XY}^2 + r_{XY}^2 r_X}{1 - r_{XY}^2} \quad (7)$$

5 These formulae are taken from Tucker, et al.,
1966, pp. 468-469, but the notation has been
modified so as to be consistent with that used in
Chapter 2.

Applying these three formulae for reliability
to the overall (Time 1 to 3) change scores for the
six dimensions previously discussed yields estimates
displayed in Table 3-4. The internal consistencies,
stabilities, and standard deviations
of the static initial and final scores are
represented along with the three types of change
score reliability estimates. It is apparent that
there is very little to be gained from adjusted
or regressed gain scores, at least insofar as
reliability increases are concerned. In addition,
even reliabilities of the regressed gain scores
are not very high, averaging .60 for these six
dimensions. Excluding the Positive School
Attitudes scale (because of the confounding of
response bias with internal consistency), this
average reliability drops to .55. When we recall
that these dimensions were selected because of
their likelihood of showing change, we get an
even more pessimistic picture of the utility of
these change scores.
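The three reliability formulae translate directly into code. In the Python sketch below the function and dictionary names are hypothetical; the inputs are the Time 1 and Time 3 internal consistencies from Table 3-3 and the Time 1-3 stabilities from Table 3-2. Because the standard deviations come from Table 3-1, which is not reproduced here, only the two formulae that do not require them are applied to the six dimensions.

```python
def rel_raw_difference(r_x, r_y, r_xy, s_x, s_y):
    """Formula 5."""
    cross = 2 * r_xy * s_x * s_y
    return (r_y * s_y ** 2 - cross + r_x * s_x ** 2) / (s_y ** 2 - cross + s_x ** 2)


def rel_independent_gain(r_x, r_y, r_xy):
    """Formula 6."""
    return r_x * (r_x * r_y - r_xy ** 2) / (r_x ** 2 - 2 * r_x * r_xy ** 2 + r_xy ** 2)


def rel_regressed_gain(r_x, r_y, r_xy):
    """Formula 7."""
    return (r_y - 2 * r_xy ** 2 + r_xy ** 2 * r_x) / (1 - r_xy ** 2)


# (alpha at Time 1, alpha at Time 3, Time 1-3 stability)
CRITERIA = {
    "Job Information Test": (.699, .671, .53),
    "Self-Esteem": (.737, .785, .49),
    "Positive School Attitudes": (.909, .912, .42),
    "Ambitious Job Attitudes": (.637, .640, .36),
    "Internal Control": (.554, .675, .32),
    "Trust in the Government": (.637, .635, .33),
}

regressed = {name: rel_regressed_gain(*v) for name, v in CRITERIA.items()}
independent = {name: rel_independent_gain(*v) for name, v in CRITERIA.items()}
mean_regressed = sum(regressed.values()) / len(regressed)
```

Evaluated this way, the regressed-gain reliabilities average about .60 across the six dimensions, in line with the figure reported in the text.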

The Predictability of Change Scores 

At this point in the development of the 
analytic strategy, the use of change scores was 
becoming increasingly doubtful. Before deciding 
to abandon them altogether, however, a series of 
analyses was conducted to test their predictability. 
Previous analyses had already demonstrated that 
many of these criterion dimensions were equally 
predictable (at Times 1, 2, and 3) from background
characteristics (Bachman, 1970, p. 208ff.).
Thus, it seemed highly unlikely that change scores 
would be predictable in these instances where the 
background effects are so stable during the high 
school years. 

The boys' occupational aspirations were
singled out for these analyses both because there
was a consistent drop in the average measure across
time and because of the fact that this measure is
very likely to be highly reliable.⁶ In addition to
these factors, it is an extremely important criterion
dimension and one which is likely to be
affected by things happening in the high school
environment.

The sequence of the analyses was to first
predict in a bivariate fashion to the Time 1, 2,
and 3 static scores from a set of important background
characteristics. Then, raw difference and
independent gain scores were predicted from the
same set of characteristics. Finally, joint
predictions were made from the set of background
characteristics to the static, raw difference,
and independent gain scores.⁷

6 The method of producing the status of aspired
occupation score is described in Bachman, 1970,
pp. 173-174. Because it is a single question, no
internal consistency coefficient could be calculated
for this dimension.

7 These joint predictions utilised a multiple
regression technique called Multiple Classification
Analysis (MCA), developed by Andrews, et al.,
1967. See Bachman, 1970, pp. 62-75 for an application
of this technique using Youth in Transition
data.

The results of these analyses are displayed
in Table 3-5. The stability of the pairwise and
joint relationships to the static scores may be
observed by reading the coefficients in the first
three columns. The next three columns attest to
the fact that the corresponding relationships to
the raw change scores are extremely small, as we
expected. Of particular note, however, are the
magnitudes of the coefficients in the last three
columns. Except for race, the eta coefficients
all exceed .10, suggesting that relationships do
indeed exist between the independent gain scores
and the background factors.
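For reference, a bivariate eta coefficient of the kind reported in such tables is the square root of the between-category sum of squares divided by the total sum of squares. The Python sketch below shows that computation only; it is not the MCA program, and the function name, category labels, and scores are hypothetical.

```python
from statistics import mean


def eta(groups):
    """Correlation ratio of a criterion on a categorical predictor.

    groups: dict mapping each predictor category to the list of
    criterion scores observed in that category.
    """
    scores = [v for vals in groups.values() for v in vals]
    grand = mean(scores)
    ss_total = sum((v - grand) ** 2 for v in scores)
    ss_between = sum(len(vals) * (mean(vals) - grand) ** 2
                     for vals in groups.values())
    return (ss_between / ss_total) ** 0.5


# Hypothetical gain scores grouped by a three-category background factor.
example = {
    "low": [-2.0, -1.0, 0.5, 1.0],
    "middle": [-1.0, 0.0, 0.5, 1.5],
    "high": [-0.5, 0.5, 1.0, 2.0],
}
eta_example = eta(example)
```

Eta ranges from 0, when all category means coincide, to 1, when scores do not vary within categories.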

These apparent relationships are most 
troublesome, because we had reason to believe 
from the coefficients in the first three columns 
that these background characteristics would not 
be related to changes in the aspired occupational 
status. The problem may be resolved when we
examine more closely the nature of the independent 
gain score being analyzed here. 

Independent gain scores of this type are 
aimed at adjusting the raw difference score for 
what is typically called a "regression effect." 
This effect is due to the commonly observed 
situation in which extreme scores at the initial 
observation "regress" toward the mean (i.e., they 
tend to be less extreme) at the time of the final 
observation. The raw difference score thus 
"penalizes" a respondent with a high initial 
score by increasing the probability that he will 
have a lower final score (and thus a negative 
raw change). The independent gain score adjusts
for this effect by not subtracting all of the 
initial score out. 
Specifically, these scores take the form

$$G = Y - \beta_{Y \cdot X} X \quad (8)$$

where X and Y are, as before, the initial and final
scores, and $\beta_{Y \cdot X}$ is the regression coefficient of
the final score on the initial score.⁸ Since $\beta_{Y \cdot X}$
will, in the usual case, be a value less than
unity, the resulting regressed gain score will be
comprised of the final score minus a part of the
initial score. In the limit, when $\beta_{Y \cdot X} = 1$, the
independent gain scores and the raw difference
scores will be identical. In the other limit
(i.e., where the stability coefficient is zero),
$\beta_{Y \cdot X} = 0$, and the independent gain score equals
the final score.

In most situations, the regression coefficient
will be neither 0 nor 1, but somewhere
between.⁹ Adjusting in this way thus produces a
set of scores which are less negatively correlated
with the initial score than are raw difference
scores; but these adjusted gain scores will be
more positively correlated with the final scores
than are the raw gain scores. Thus, what we are
observing in the right three columns of Table 3-5
is, at least in part, this undesirable feature of
independent gain scores. To the extent that a
predictor variable relates to a final score in
about the same degree as to the initial score, it
will also show an artifactual relationship to this
kind of independent gain score.

8 Algebraically, $\beta_{Y \cdot X} = r_{XY}(s_Y / s_X)$ (Hays, 1963,
p. 504).

9 Regression coefficients can be negative, but
this is highly unlikely in the present instance.
It would only obtain when a variable correlated
negatively with itself across time. In Table 3-2,
the lowest correlation observed was .32.
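The artifact can be shown algebraically for standardized scores. In the Python sketch below, a predictor W is assumed to correlate equally with the initial and the final score; the function name and the numerical values are hypothetical. Setting the weight to 1 gives the raw difference, and setting it to the stability coefficient gives a gain score of the form of equation 8 when the two standard deviations are equal.

```python
from math import sqrt


def corr_with_gain(r_wx, r_wy, r_xy, beta):
    """Correlation of W with G = Y - beta * X, all variables standardized."""
    covariance = r_wy - beta * r_wx
    variance_g = 1 - 2 * beta * r_xy + beta ** 2
    return covariance / sqrt(variance_g)


# Hypothetical values: W relates equally (.30) to both static scores,
# and the stability of the criterion is .50.
r_w, r_xy = 0.30, 0.50

with_raw_difference = corr_with_gain(r_w, r_w, r_xy, beta=1.0)
with_regressed_gain = corr_with_gain(r_w, r_w, r_xy, beta=r_xy)
```

Under these assumptions the predictor is unrelated to the raw difference but shows a positive correlation with the regressed gain, even though its relationship to the criterion has not changed between the two measurements.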



This is one of the major reasons that Lord did
not define his regressed gain scores via equation
8. Rather, he defined a set of scores which
would be unrelated to both the initial (true) and
final (true) scores. He accomplished this by
defining estimated true gain as (Lord, 1967, p. 28)

$$\hat{G} = \bar{G} + \beta_{GX \cdot Y}(X - \bar{X}) + \beta_{GY \cdot X}(Y - \bar{Y}). \quad (9)$$

In this formula, the mean gain ($\bar{G}$) may be estimated
(for a sufficiently large sample) by the mean
difference ($\bar{Y} - \bar{X}$), and the regression coefficients
(adjusted for measurement error) may be estimated
as follows:

$$\beta_{GX \cdot Y} = \frac{(1 - r_Y)\, r_{XY}\, s_Y / s_X - r_X + r_{XY}^2}{1 - r_{XY}^2}, \text{ and} \quad (10)$$

$$\beta_{GY \cdot X} = \frac{-(1 - r_X)\, r_{XY}\, s_X / s_Y + r_Y - r_{XY}^2}{1 - r_{XY}^2}. \quad (11)$$

As before, $r_X$ and $r_Y$ represent the internal
consistencies of the initial and final scores,
$s_X$ and $s_Y$ the standard deviations of the initial
and final scores, and $r_{XY}$ the stability coefficient
across the interval for which the adjusted
gain score is being calculated.
This procedure outlined by Lord thus avoids
the undesirable "built-in" relationship between
the final score and a regressed gain score of the
form $G = Y - \beta_{Y \cdot X} X$. Let us examine the Lord
procedure in more detail in order to determine
whether such adjusted gain scores would be useful.

From Tables 3-1 and 3-3 we observed the
following things about most of the criterion
dimensions.



(A) The means for any one dimension were very
    stable across time. From this the following
    approximation holds: $\bar{X} = \bar{Y} = M$. (12)

    From this it follows that: $\bar{G} = \bar{Y} - \bar{X} = 0$. (13)

(B) The standard deviations for a repeated
    measure did not change very much across
    time. Thus, it will generally be true
    that: $s_X = s_Y$, (14)
    and from this it follows that: $s_X / s_Y = s_Y / s_X = 1$. (15)

(C) The internal consistencies for most of
    the criteria did not change from Time 1
    to Time 3. Thus, $r_X = r_Y = r$. (16)



Now, substituting appropriately from equations
15 and 16 into equations 10 and 11 yields the
following:

$$\beta_{GX \cdot Y} = \frac{(1 - r)\, r_{XY} - r + r_{XY}^2}{1 - r_{XY}^2}, \text{ and} \quad (10.1)$$

$$\beta_{GY \cdot X} = \frac{-(1 - r)\, r_{XY} + r - r_{XY}^2}{1 - r_{XY}^2}. \quad (11.1)$$

From equations 10.1 and 11.1 it follows that

$$\beta_{GX \cdot Y} = -\beta_{GY \cdot X}. \quad (17)$$

Substituting from equations 12, 13, and 17 into
equation 9 for the estimated true gain, we see
that:

$$\hat{G} = \beta_{GX \cdot Y}(X - M) + (-\beta_{GX \cdot Y})(Y - M).$$

Expanding, we get:

$$\hat{G} = \beta_{GX \cdot Y} X - \beta_{GX \cdot Y} M - \beta_{GX \cdot Y} Y + \beta_{GX \cdot Y} M.$$

Hence,

$$\hat{G} = \beta_{GX \cdot Y}(X - Y). \quad (18)$$

In terms of the other regression coefficient
we have an equivalent statement; namely,

$$\hat{G} = \beta_{GY \cdot X}(Y - X). \quad (19)$$

Let us look at a rather typical example where the internal
consistency ($r = .7$) and the stability ($r_{XY} = .5$)
coefficients are about average for our data.

$$\beta_{GX \cdot Y} = \frac{(1 - r)\,r_{XY} - r + r_{XY}^{2}}{1 - r_{XY}^{2}} = \frac{(1 - .7)(.5) - .7 + .25}{1 - .25} = \frac{-.30}{.75} = -.40$$

$$\hat{G} = \beta_{GX \cdot Y}(X - Y) = -.40\,(X - Y) = .40\,(Y - X).$$

So, in this example, each individual's estimated true gain
score would be calculated by multiplying his raw difference
score by .40.
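
The arithmetic of this example can be checked with a short
sketch. It implements only the simplified equations 10.1 and
11.1, so it assumes equal standard deviations and equal
internal consistencies at the two time points; the function
and argument names are illustrative and do not come from the
original analyses.

```python
def lord_coefficients(r, r_xy):
    """Return (beta_gx_y, beta_gy_x) from equations 10.1 and 11.1.

    r    : internal consistency, assumed equal at both times
    r_xy : stability coefficient across the interval
    """
    beta_gx_y = ((1 - r) * r_xy - r + r_xy ** 2) / (1 - r_xy ** 2)
    # Equation 17: the two coefficients differ only in sign.
    return beta_gx_y, -beta_gx_y


def estimated_true_gain(x, y, r, r_xy):
    """Equation 19: the raw difference Y - X times a constant."""
    _, beta_gy_x = lord_coefficients(r, r_xy)
    return beta_gy_x * (y - x)


# Values used in the example: r = .7, r_xy = .5.
b_x, b_y = lord_coefficients(r=0.7, r_xy=0.5)
print(round(b_x, 2), round(b_y, 2))
```

Under these simplifying conditions the multiplier reduces
algebraically to (r - r_xy) / (1 - r_xy), which is below one
whenever the internal consistency is below one.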

The relationships given in equations 12-19 
will be good approximations for most of our 
criterion dimensions. It will in general be true 
that following Lord's procedure to estimate true 
gain will yield a score which is simply a constant 
multiplied by the raw difference score. Thus, 
their correlations with the raw difference scores 
will be very close to 1. And since the constant 
multiplier will always be less than one, the 
estimated true gain scores will have smaller 
variances, making it very unlikely they will be 
more predictable. From this information, we 
conclude that estimated true gain scores have 
very limited utility for analyzing the majority 
of the Youth in Transition criterion data. 

Summary 

In this chapter we have examined a large number of criterion
dimensions which were measured at each of the first three
points in time. In general, these dimensions were
characterized by a great deal of stability in their means
and variances. Also, their auto-correlations suggested that
most of the sample were maintaining their relative position
on these dimensions throughout the high school years. In
addition, we examined internal consistency measures for a
set of dimensions; they indicated that the measures were
fairly reliable and that the reliability did not shift much
across time.

The remainder of the chapter was devoted to an exploration
of the effects of this general stability and reliability on
the use of change scores in the analysis of these repeatedly
measured criteria. We witnessed the fact that independent
gain scores and regressed gain scores have higher
reliabilities than do raw difference scores, but the
increase in reliability is by no means large. Furthermore,
we noticed that the independent gain scores were predictable
in a situation where we expected no prediction to be
observed. We found that this was due, at least in part, to
an artifactual relationship between the independent gain and
final (static) scores arising from the method used to
calculate the independent gain score.

We next examined a procedure suggested by 
Lord (and outlined in an earlier chapter) to 
derive a true gain score which did not show this 
undesirable relationship to the final static 
score. When the conditions following from the 
earlier-noted stability and reliability of our 
criteria were reintroduced in conjunction with 
Lord's procedure, we observed that the resulting 
estimated true gain scores would not order our
sample any differently than would raw difference
scores, and that the estimated true gain scores 
would have smaller variances. 

Therefore, we concluded that for the purpose 
of deriving a dependent variable for subsequent 
analyses, none of the adjusted gain score proce- 
dures result in sufficient improvement over the 
raw difference score to warrant the considerable 
effort and expense necessary to calculate them.

Chapter 4

EVIDENCE OF TRENDS
AND SUBGROUP ANALYSES



To this point, we have seen little evidence 
that change scores, raw or residualized, will 
facilitate the longitudinal analyses of the Youth 
in Transition data. It is the case, however, that 
changes in many of our criteria are occurring in 
at least some of the boys in our sample.¹ This
raises the two additional questions to be 
addressed in this chapter. Is there any evidence 
of trends in the scores of those boys who do 
indeed change? How shall we identify and analyze 
those sets of respondents that show particular 
patterns of change across time? Let us now turn 
our attention to the first of these questions. 

Evidence of "Trends" in the Observed Changes

Before discussing the statistical procedure 
for indicating trends in the data, let us first 
examine what is meant by a trend. Consider three 
respondents all of whose T2 scores are in the 
middle bracket of the trichotomy. (See Figure 4-1.) 

¹ For example, the stability coefficients in Table 3-2,
while quite high, still leave room for quite a bit of change
to be taking place across each interval.

FIGURE 4-1

"TRIADS" IN THE JOB INFORMATION TEST

[Figure not reproduced. Rows: respondent Types A (Lo→Mid),
B (Mid→Mid), and C (Hi→Mid). Columns: Expected T3 Score
(based on Observed T1 & T2 Scores) and Observed T3 Score.
Vertical axis: score, marked from 14 to 22.]
Respondent A was in the lower bracket of the trichotomy at
T1, respondent B was in the middle bracket at T1, and
respondent C was initially in the upper bracket. If there
were evidence of a trend-type change across the T1 to T3
interval, we would expect the shift between T2 and T3 to be
in the same direction as the observed shift between T1 and
T2. The dotted lines in the "Expected T3 Score" column of
the Figure indicate that the magnitude of the expected T2 to
T3 shift is likely to be less than that observed between T1
and T2. Among the several reasons for this expected decrease
in magnitude is the one owing to the crude classification
procedure inherent in a trichotomy. That is, rather
arbitrary cutting points were used to separate low and
middle scores, and middle and high scores. Thus two
respondents whose scores differ very slightly may fall on
opposite sides of one of these cutting points. Since the
observed scores for these two respondents may differ only as
a function of measurement error, we may well have
misclassified one or both of them in trichotomizing the
scores.

The effect of such misclassifications may be seen better if
we consider a specific example. Suppose that several
respondents whose true scores should have placed them
initially in the Middle category at T1 had a sufficiently
large (and negative) measurement error component in their
observed scores that they were classified as Low instead.
Now if there really are differences in the way that those
correctly classified as Low, Middle and High change across
time, the misclassified respondents above ought to behave
more like the Middle group than the Low group. Since the
Middle group will tend to remain fairly stable across the
total interval, these misclassified respondents would not be
expected to shift upwards to the same degree as would those
respondents correctly classified as Low. Thus, the overall
T2 to T3 shift for an A-type respondent (see Figure 4-1)
would not be expected to be as large as the shift observed
between T1 and T2.

The second column of plots in Figure 4-1 presents the
observed T3 scores for respondent Types A, B, and C. As can
readily be observed (and as we will shortly see in more
statistical terms), no support is found for the notion that
trend-like changes are occurring in the Job Information
Test. On the contrary, the direction of the T2 to T3 change
for respondent Types A and C is exactly opposite to the
direction expected if a trend model were to be used to
account for the observed shifts.

What we sought at this point was a procedure which would
take advantage of all three observations and of whatever
evidences of trends could be observed. Our initial efforts
were focused on a method which could be used to discover
whether the T2 version of the criterion dimension would
measurably increase the prediction of the T3 score beyond
whatever predictive power the T1 score had. We reasoned that
if there were trend-like changes occurring for some
respondents, a model based on additive prediction would not
fit the data as well as one which permitted prediction from
an interactive term. This additional prediction would be due
to the fact that a trend-like shift would necessitate a
non-additive explanatory term to account for those changers
whose scores are deviating from the typically observed
overall pattern of stability. This is not to say that the
discovery of a non-additive explanatory term would indicate
that trends are present; rather, it is to say that no trends
are likely to be discovered if an additive model fits the
data as well as a model which permits interaction. Thus, the
presence of interaction is seen to be a necessary, but not
sufficient, condition for the discovery of trend-like
changes in situations where the overall stability is high.

The procedure which satisfied the objectives described above
involves comparing the additive prediction of a Time 3
criterion dimension from T1 and T2 versions of that
dimension with prediction from a specially-constructed
combination of the Times 1 and 2 scores. This special T1 and
T2 combination score is defined as follows:

1 - Trichotomize the T1 and T2 versions of the criteria via
    a bracketing procedure which includes in the middle
    bracket observations falling within 3/4 of a standard
    deviation of the overall mean.² (If the scores were
    distributed normally about this mean, then the middle
    bracket would include about 55% of the scores with the
    upper and lower categories each containing about 22%.)

2 - Develop a nine-category T1 by T2 composite score for
    each dimension from the T1 by T2 bivariate table based
    on the trichotomies from step 1. The nine values were
    assigned according to the table below.



TABLE 4-1

CONSTRUCTION OF A NINE-CATEGORY
COMPOSITE (T1 BY T2) SCORE

                        Time 2 Trichotomy
Time 1 Trichotomy       1       2       3
        1               1       2       3
        2               4       5       6
        3               7       8       9

3 - Predict to the T3 criterion from the T1 and T2
    trichotomies using Multiple Classification Analysis
    (MCA).

4 - Predict to the T3 criterion from the T1 by T2 composite
    score using one way analysis of variance.

5 - Compare the multiple correlation (R) from step 3 with
    the eta coefficient (η) from step 4.

² The "overall mean" is the average of the means of the
criteria at all three points in time. For the most part, the
standard deviations were equivalent for the three time
periods. Where this was not the case, an average of the
three standard deviations was used.
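
The five steps can be sketched in a few lines of Python. This
is only an illustration under stated assumptions: the MCA
program itself is not reproduced, and an additive
(main-effects-only) least-squares fit, found here by
alternating updates of the row and column effects, stands in
for it; R is not adjusted for degrees of freedom; the data are
synthetic; and all function names are hypothetical.

```python
import random
from statistics import NormalDist, mean, pstdev


def trichotomize(score, overall_mean, sd, width=0.75):
    """1 = low, 2 = middle (within width * sd of the mean), 3 = high."""
    if score < overall_mean - width * sd:
        return 1
    if score > overall_mean + width * sd:
        return 3
    return 2


def composite(t1_cat, t2_cat):
    """Nine-category score laid out as in Table 4-1 (rows T1, columns T2)."""
    return 3 * (t1_cat - 1) + t2_cat


def eta(groups, y):
    """Correlation ratio from a one-way analysis of variance."""
    grand = mean(y)
    cells = {}
    for g, v in zip(groups, y):
        cells.setdefault(g, []).append(v)
    between = sum(len(vs) * (mean(vs) - grand) ** 2 for vs in cells.values())
    total = sum((v - grand) ** 2 for v in y)
    return (between / total) ** 0.5


def additive_r(a, b, y, sweeps=100):
    """Multiple R for a main-effects-only fit of y on two categorical predictors."""
    grand = mean(y)
    eff_a = dict.fromkeys(set(a), 0.0)
    eff_b = dict.fromkeys(set(b), 0.0)
    for _ in range(sweeps):
        # Each effect becomes the mean residual of its category.
        for k in eff_a:
            eff_a[k] = mean(v - grand - eff_b[j] for i, j, v in zip(a, b, y) if i == k)
        for k in eff_b:
            eff_b[k] = mean(v - grand - eff_a[i] for i, j, v in zip(a, b, y) if j == k)
    resid = sum((v - grand - eff_a[i] - eff_b[j]) ** 2 for i, j, v in zip(a, b, y))
    total = sum((v - grand) ** 2 for v in y)
    return max(0.0, 1 - resid / total) ** 0.5


# Synthetic, purely additive data for three waves (an assumption for illustration).
rng = random.Random(0)
t1 = [rng.gauss(0, 1) for _ in range(600)]
t2 = [0.6 * x + rng.gauss(0, 0.8) for x in t1]
t3 = [0.3 * x + 0.5 * z + rng.gauss(0, 0.8) for x, z in zip(t1, t2)]

overall_mean = mean([mean(t1), mean(t2), mean(t3)])
overall_sd = mean([pstdev(t1), pstdev(t2), pstdev(t3)])
c1 = [trichotomize(v, overall_mean, overall_sd) for v in t1]   # step 1
c2 = [trichotomize(v, overall_mean, overall_sd) for v in t2]
nine = [composite(i, j) for i, j in zip(c1, c2)]               # step 2
r_additive = additive_r(c1, c2, t3)                            # step 3
eta_composite = eta(nine, t3)                                  # step 4
print("R =", round(r_additive, 3), "eta =", round(eta_composite, 3))  # step 5

# Share of a normal distribution within 3/4 of a standard deviation of the mean.
middle_share = 2 * NormalDist().cdf(0.75) - 1
print("middle bracket share:", round(middle_share, 3))
```

Because the nine-cell composite can reproduce any additive
pattern, eta can never fall below R; the question in step 5 is
only whether it exceeds R by a meaningful margin.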

The comparison in step 5 of this procedure 
provides the essential data for indicating trends. 
For if the additive combination of T1 and T2 
scores (summarized by the multiple correlation 
coefficient, R, in the MCA analysis) does as well 
as the prediction based on the composite score 
(as reflected by the eta coefficient in the 
analysis of variance) , then there is little 
support for the notion that there is an overall 
trend in the T1 to T3 changes. In other words,
the direction of the T1 to T2 change should be
directly related to the direction of the T2 to T3 
change if trends are to be observed. 

For example, consider an individual with a composite score
of 8 (see Table 4-1). This individual has dropped from the
upper to the middle bracket in the T1 to T2 interval. If
this drop were indicative of a trend, then those with
composite scores of 8 would be expected to have lower T3
scores than would people who have composite scores of 5
(i.e., those whose scores remained stable in the middle
category at T1 and T2). As the data below will document,
just the opposite effect is observed.

To illustrate the application of this procedure, the Job
Information Test will be used as the criterion. This
dimension was chosen because it did evidence an overall mean
shift upwards across both the Times 1 to 2 and the Times 2
to 3 intervals. (See Table 3-1.) In each of the nine cells
of Table 4-2 below, the frequencies (n), means (x̄), and
standard deviations (sd) of Time 3 scores are presented for
the respective combination of Time 1 and Time 2 scores.
Additionally, an expected cell mean (x̂) based on the
adjusted MCA coefficients is given for each cell. Also shown
in the table as marginal totals are the frequencies (N) and
means (X̄) for the T1 and T2 trichotomous variates. Finally,
the MCA coefficients for each level of the two trichotomies
are displayed in the last row and column, and the multiple
correlation coefficient (R) and eta coefficient (η) are
given in the lower right corner of the table.

The two coefficients (R and η) are obviously extremely
similar, suggesting that the additive prediction (via MCA)
of the T3 criterion from the T1 and T2 trichotomies is
almost as good as the prediction which also incorporates
interaction between the T1 and T2 trichotomies in predicting
the T3 criterion. How good is the prediction from the MCA?
To answer this question, a cell mean was predicted for each
of the 9 cells in the table, using the MCA coefficients.
(For example, the prediction for the 1,1 cell was obtained
by taking the algebraic sum of the grand mean and the row 1
and column 1 coefficients:
$\hat{x}_{1,1} = 19.07 + (-1.04) + (-2.45) = 15.58$.)
Inspection of the table reveals remarkable similarities
between the observed and predicted cell means. As a matter
of fact, the only cell in which these two means differed by
more than .3 is the 3,1 cell, in which the observed mean is
based on only 2 observations! Also noteworthy in the table
is the fact that the respondents who drop in score from T1
to T2 have higher T3 scores (and those who gain in score
from T1 to T2 have lower T3 scores) than respondents who had
T2 scores similar to those of the "movers" but whose scores
had remained stable at that level from T1. As mentioned
previously, these observations are certainly not consistent
with the notion that the observed movement reflects a trend
over time.
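
The additive prediction of a cell mean described above
amounts to a single sum. A minimal sketch follows; the
function name is hypothetical, and the only numbers used are
the three quoted in the text for the 1,1 cell.

```python
def predicted_cell_mean(grand_mean, row_coefficient, column_coefficient):
    """Additive (no-interaction) prediction for one cell of the T1 by T2 table."""
    return grand_mean + row_coefficient + column_coefficient


# Grand mean and the row 1 and column 1 coefficients quoted for the 1,1 cell.
cell_1_1 = predicted_cell_mean(19.07, -1.04, -2.45)
print(round(cell_1_1, 2))
```

Subtracting such a prediction from the observed cell mean
gives the part of that cell which the additive model leaves
unexplained.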

Another variable which showed a consistent movement in the
means across time (see Table 3-1) was Positive School
Attitudes. Of special interest was the fact that the
attitudes of those staying in the same school were getting
consistently less positive as time passed. In Table 4-3
below are displayed data parallel to those given for Job
Information in the preceding table. Again, no evidence of
trend-like shifts across the Times 1 to 3 interval may be
observed.

Nine other criterion dimensions were examined (using the
same procedure) for evidences of trends. As for the two
dimensions reported above, remarkable similarities between
the multiple correlation and eta coefficients were observed.
These relationships are summarized in Table 4-4 below; the
more detailed presentation of the data from analyses of
these dimensions is given in Appendix C.



TABLE 4-4

OVERALL PREDICTION OF NINE
TIME THREE CRITERION SCORES FROM AN
ADDITIVE COMBINATION OF TIME ONE AND
TIME TWO SCORES (R) VS. PREDICTION FROM
A TIME ONE BY TIME TWO COMPOSITE SCORE (η)

Dimension Name                    R       η
Negative School Attitudes       .540    .543
Academic Achievement Value      .449    .453
Internal Control                .514    .515
Self-Esteem                     .615    .616
Negative Affective States       .646    .647
Social Values                   .558    .558
Ambitious Job Attitudes         .502    .504
Aspired Occupational Status     .645    .652
Delinquent Behaviors            .640    .641

In short, the procedure outlined earlier in 
the chapter uncovered no evidence of trend-like 
changes in the eleven dimensions examined. Thus, 
the procedure adds little if anything to change 
the overall picture of stability which has pre- 
viously emerged. However, the detailed tables 
(4-2, 4-3, and Appendix C) document the nature of
the shifts which are taking place from beginning
sophomore to end junior to end senior years for 
those boys who did stay in the same school during 
this entire period. For this descriptive purpose, 
these tables may have some utility. Let us now 
turn our attention to the matter of subgroup 
analyses, the second question to be addressed in 
this chapter. 

Analysis of Subgroups 

The fact that few overall trends have been observed thus far
in no way precludes the possibility that subgroups within
the total sample may be changing in identifiable and
interesting ways. For example, a monograph has recently been
written by other members of the Youth in Transition staff
which focuses attention on comparisons between and among
three major subgroups: those who drop out of high school,
those who graduate from high school but do not continue
their education further, and those who pursue their
education beyond high school (Bachman, et al., 1971).
Subsequent analysis efforts to be reported in later
publications will be aimed at other subgroups of interest;
examples of such subgroups include respondents with military
experience, graduates of work-study programs, and those who
continue their formal education beyond high school.

One general method of determining whether or not subgroups
differ along the criterion dimension is to develop a set of
subgroup categories which are both totally inclusive and
mutually exclusive. Having thus classified everyone in the
sample using this "subgroup variable," two MCA runs are made
to predict a dependent variable of interest: one run simply
predicts from the set of selected independent variables to
the dependent variable. The other run is similar except that
the "subgroup variable" is added to the predictor list.
Should the multiple correlation coefficient from the second
run be significantly higher than that from the first run,
one can conclude that there are differences among the
subgroups which are worth further exploration. However,
should the two multiple correlation coefficients be
essentially the same, then one can conclude that the
"subgroup variable" does not contribute much to the
predictive ability of the set of independent variables.
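
The text does not say how "significantly higher" is to be
judged. One conventional choice, assumed here and not taken
from the original analyses, is the F ratio for the increment
in R squared between two nested prediction runs. The function
name and the numbers in the example are hypothetical.

```python
def incremental_f(r_reduced, r_full, n, k_reduced, k_full):
    """F ratio for the gain in R squared when predictors are added.

    k_reduced and k_full count the estimated predictor terms in the two
    runs (for a categorical predictor, its number of categories minus one).
    """
    df_num = k_full - k_reduced
    df_den = n - k_full - 1
    gain = r_full ** 2 - r_reduced ** 2
    f_ratio = (gain / df_num) / ((1 - r_full ** 2) / df_den)
    return f_ratio, df_num, df_den


# Hypothetical values: a four-category subgroup variable adds three terms.
f_ratio, df_num, df_den = incremental_f(0.50, 0.55, n=1000, k_reduced=10, k_full=13)
print(round(f_ratio, 2), df_num, df_den)
```

The resulting ratio would be referred to an F table with the
two degrees of freedom returned by the function.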

In other instances, one's interest in subgroups extends to
more complex analytic areas, however. Within the observed
general setting of overall stability, it would be necessary
for some subgroups to be moving in one direction while other
subgroups are changing in the opposite direction on the same
dimension, and detection of such "counterbalancing" would
require a different technique from that described above. If
we should observe instances where definite but opposing
shifts could be identified with various subgroups which
produced no observable effect in the aggregate, then we
would have identified an area where more sophisticated
analysis efforts would be required. In such areas, for
example, we might want to do regression analyses within each
subgroup using the same set of predictor variables. Of
particular interest in such analyses would be the
identification of any variable whose effect is facilitative
for one or more subgroup(s) and debilitative for others.
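
A toy illustration of "counterbalancing" may make the point
concrete. The records below are invented for the purpose: one
subgroup rises, another falls by the same amount, and the
aggregate mean change is zero. The function name and the
record layout are assumptions, not part of the study's
procedures.

```python
from statistics import mean


def mean_change(records):
    """records: (subgroup, first_score, last_score) triples."""
    by_group = {}
    for subgroup, first, last in records:
        by_group.setdefault(subgroup, []).append(last - first)
    overall = mean(c for changes in by_group.values() for c in changes)
    return overall, {g: mean(changes) for g, changes in by_group.items()}


# Hypothetical scores: subgroup "a" rises while subgroup "b" falls.
records = [("a", 10, 12), ("a", 11, 13), ("b", 12, 10), ("b", 13, 11)]
overall, per_group = mean_change(records)
print(overall, per_group)
```

An analysis of the aggregate alone would report no change
here, which is why the subgroup breakdown is needed.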

At the present time, analysis efforts of this type have not
been undertaken for several reasons:

(1) subgroups have only recently been identified, and in
    many cases even initial analyses are not yet underway,

(2) considerable effort to date has been focused on the
    investigation of family background predictors whose
    effects seem to be rather unidirectional,

(3) school effects analyses (perhaps the most interesting
    area for investigations of this type) are still in the
    planning stages due to the very complex and
    time-consuming nature of developing measures for
    characterizing the schools, and

(4) as mentioned earlier, we need to spend most of our time
    developing and utilizing procedures which have general
    utility across most or all areas of our analytic
    framework; thus, concentrating efforts on one or more
    subgroups has been a tempting activity which we have had
    to avoid.
In order to anticipate what such analyses might tell us, the
author has conducted a brief investigation of the
relationship between College Plans and Status of Aspired
Occupation. Of special interest here is the question of what
effects, if any, on a boy's Aspired Occupation result from a
change in his College Plans. The data displayed in Figure
4-2 are of interest for several reasons. First of all, we
note that those who consistently plan to go to college (the
111 group) have the highest aspired occupational status at
all three points in time, whereas those who never plan to go
to college (the 000 group) have the lowest occupational
status across time. Secondly, both at Time 1 and Time 3 (and
also at Time 2 except for the 101 group) those planning to
attend college have higher aspired occupational status than
do those not planning to attend college. Thirdly, a shift in
college plans is accompanied (in all cases except for the
101 group) by a similar shift in aspired occupational
status; that is, the two groups who originally planned
college but subsequently dropped those plans (110 and 100)
are observed to have the largest drop in aspired
occupational status at the time they dropped their college
plans, whereas those groups not originally planning college
who later changed their plans (011 and 001) may be seen to
have the largest gain in aspired occupational status at the
time their plans changed to include college.
FIGURE 4-2

STATUS OF ASPIRED OCCUPATION VS. COLLEGE/NON-COLLEGE PLANS

[Figure not reproduced. Horizontal axis: Time 1, Time 2,
Time 3.]

The ranges represent the minimum and maximum frequencies
underlying the points along each line. The triads in the
right margin identify whether respondents represented by
each line had college plans (1) or did not have college
plans (0) at Times 1, 2, and 3 respectively.
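
The grouping behind such a display can be sketched as
follows. The triad labels follow the coding just described;
the function names, the data layout, and the example scores
are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def triad(plans):
    """'110' = college plans at Times 1 and 2 but not at Time 3."""
    return "".join("1" if p else "0" for p in plans)


def mean_status_by_triad(respondents):
    """respondents: (plans, statuses) pairs, each holding three values."""
    grouped = defaultdict(list)
    for plans, statuses in respondents:
        grouped[triad(plans)].append(statuses)
    # zip(*rows) regroups the scores wave by wave within each triad.
    return {key: tuple(mean(wave) for wave in zip(*rows))
            for key, rows in grouped.items()}


# Invented respondents: (college plans at T1-T3, aspired status at T1-T3).
respondents = [
    ((1, 1, 0), (70, 68, 50)),
    ((1, 1, 0), (60, 62, 40)),
    ((0, 0, 0), (30, 31, 29)),
]
status = mean_status_by_triad(respondents)
print(status)
```

Each key of the result corresponds to one line of the figure,
and each tuple to the three points plotted along it.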
The data in Figure 4-2 provide an illustration of one type
of analysis which can be conducted for exclusive subgroups
(in this case the college-bound vs. the noncollege-bound).
More sophisticated analyses of such subgroups will
undoubtedly be undertaken as a part of forthcoming
publications.

Identification of Subgroups of "Changers" 

Subgroups described thus far have been identified because of
conceptual or substantive interest. It is also possible to
use empirical procedures to identify subgroups of "changers"
for additional analyses.

For example, Trent identified three groups by defining an
"exceptional" change group as those whose (raw) change
scores were three-quarters of a standard deviation or more
above the average change and a "negative" change group whose
change scores were at least three-fourths of a standard
deviation below the average, with the remaining group
members falling in the "average" change group (Trent and
Medsker, 1968, pp. 178-218). Trent and Medsker spend
considerable time analyzing differences among these three
change groups. Of particular interest to our present
discussion are the data presented in Table 4-5 below.

TABLE 4-5³

INITIAL AND FINAL SOCIAL MATURITY
MEAN SCORES FOR THREE CHANGE GROUPS

                 Exceptional    Average     Negative
                 Changers       Changers    Changers
Initial Score      47.81         50.46       55.33
Final Score        66.82         56.21       48.67

³ This table is based on Table 56, p. 189, in Trent and
Medsker, 1968.

Now, completely apart from whatever "real" changes are
taking place, we would expect those with the lowest initial
scores to have higher final scores (relative to those in the
group), and we would also expect those with the highest
initial scores to have relatively lower final scores. Again,
this may reflect nothing other than the fact that
measurement error "artificially" depressed the observed
initial low scores and inflated the observed initial high
scores. Since this measurement error is assumed to be
uncorrelated across time, roughly half of those in the
extreme groups will have a final score measurement error
component which is opposite to the initial score measurement
error component, and we will thus observe that those with
extreme initial scores will have less extreme final scores
(i.e., their scores will regress toward the mean). From the
data presented by Trent, it is not possible to know how much
of the observed change might be due merely to regression;
however, it is at least safe to conclude that whatever
regression effect is present has been confounded with the
changes observed. For this reason, the present author does
not find such empirical identification of subgroups of
"changers" to be particularly helpful.
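
The regression argument can be illustrated with a small
simulation in which no respondent's true score changes at
all. Everything here is assumed for the illustration: the
mean of 50, the true-score and error standard deviations, the
sample size, and the function names. Only the
three-quarters-of-a-standard-deviation rule for forming the
change groups is taken from the description above.

```python
import random
from statistics import mean, pstdev


def change_group(change, mean_change, sd_change, width=0.75):
    """Classify a raw change score by the 3/4 standard deviation rule."""
    if change >= mean_change + width * sd_change:
        return "exceptional"
    if change <= mean_change - width * sd_change:
        return "negative"
    return "average"


def simulate(n=3000, true_sd=10.0, error_sd=5.0, seed=1):
    """Observed scores with no true change: same true score, fresh error each time."""
    rng = random.Random(seed)
    true = [rng.gauss(50.0, true_sd) for _ in range(n)]
    initial = [t + rng.gauss(0.0, error_sd) for t in true]
    final = [t + rng.gauss(0.0, error_sd) for t in true]
    return initial, final


def group_means(initial, final):
    """Mean initial and final score within each change group."""
    changes = [f - i for i, f in zip(initial, final)]
    m, s = mean(changes), pstdev(changes)
    rows = {}
    for i, f, c in zip(initial, final, changes):
        rows.setdefault(change_group(c, m, s), []).append((i, f))
    return {g: (mean(i for i, _ in r), mean(f for _, f in r))
            for g, r in rows.items()}


means = group_means(*simulate())
for name in ("exceptional", "average", "negative"):
    print(name, [round(v, 2) for v in means[name]])
```

Even with no true change, the "exceptional" group shows the
lowest initial mean and the highest final mean, which is the
same ordering that appears in Table 4-5.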

It is conceivable that a residualized gain score might be
useful in empirically identifying subgroups for subsequent
analyses. The author is quite doubtful that the considerable
effort necessary to produce such residualized gain scores
would be warranted in the present data for this purpose
alone. Had earlier analyses suggested that for more basic
purposes such a gain score was useful, we would doubtless
have investigated its utility in these analyses as well.

Summary 

The first part of this chapter was devoted to the
description of a procedure designed to investigate
trend-like changes in the criterion dimensions. Applying
this procedure to eleven dimensions resulted in no evidence
of trends in the Times 1 to 2 vs. the Times 2 to 3 changes.

The second part of the chapter contained a brief description
of some analyses performed on conceptually-defined
subgroups. An illustration of one type of data display was
given which showed how shifts into and out of the
college-bound subgroup are accompanied by corresponding
shifts in the boys' aspired occupational status. Finally, a
limitation (due largely to regression effects) in the
empirical identification of subgroups of "changers" was
described and illustrated, using data from a previous study
of post-high-school youth.
Chapter 5

SUMMARY OF THE 
PROPOSED STRATEGY 



The previous four chapters have described the 
evolution of a strategy for longitudinal analyses 
of survey panel data. In this final chapter, the 
proposed analytic model will be reviewed briefly, 
with special attention given to some critical 
questions around which application of the model 
is based. 

For What Kinds of Studies Is the Model Intended? 

The proposed analytic strategy is focused on longitudinal
analysis of panel data. Specifically, the dependent or
criterion variables of interest are assumed to be measured
on the same set of respondents at two or more points in
time. Since many of the statistical techniques employed in
the application of the model require relatively large sample
sizes,¹ it is assumed that the major use of the strategy
will be found in survey panel studies. However, except for
considerations such as those just described, the model
should find applicability in any panel study.

¹ This requirement is necessary in order to get relatively
small sampling errors for the statistical estimates employed
in techniques such as MCA.
What Kind of Change Score(s) Should be Used?

An investigator must first decide what are his intended uses
for change scores. (A partial list may be found on pages
17-18.) If one of the purposes of a change score is to
identify for further study those individuals who have gained
or lost an exceptional amount during the interval between
observations, a form of residualized gain score (see
Cronbach and Furby, 1970, pp. 77-80) might be helpful.
Except for this rather unique purpose, however, the
calculation of any type of gain score appears to be
unnecessary and, as we saw in Chapter 3, sometimes
misleading.

An additional limitation in the use of change scores is that
they utilize data from only two points in time. In pre-post
designs, this is not a serious problem; but in panel designs
which employ more than two observations, a host of problems
arises. As Figure 1-2 indicated, a "four-wave" panel study
provides that change could be studied across six different
intervals. How one chooses only some intervals for examining
change (thereby eliminating others) is a difficult problem.
This is especially true if the rates of change differ from
one interval to another. In short, the author finds himself
in agreement with Cronbach and Furby (1970, pp. 77-80) who
state that

    ...gain scores are rarely useful, no matter how they
    may be adjusted or refined (Cronbach and Furby, 1970,
    p. 68).
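
The count of six intervals for a four-wave design is simply
the number of distinct pairs of waves. A brief sketch, with a
hypothetical function name, enumerates them:

```python
from itertools import combinations


def wave_intervals(n_waves):
    """All (earlier, later) pairs of waves across which change could be scored."""
    return list(combinations(range(1, n_waves + 1), 2))


pairs = wave_intervals(4)
print(len(pairs), pairs)
```

In general a panel with n waves offers n(n - 1)/2 such
intervals, so the number of possible change scores grows
quickly as waves are added.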
Description of a "Parallel Prediction" Approach

The proposed analytic model utilizes the 
repeatedly measured dependent variables (criteria) 
as static scores. Specifically, it proposes to 
make separate predictions from a specified set of 
independent variables to the First, Second, 
Third,..., and Nth criterion scores. Of special 
interest in these analyses is the identification 
of criterion variables whose overall predictability 
is changing meaningfully across time. Also of 
interest at this stage is the relative importance 
of the independent (predictor) variables in the 
multiple prediction equations. Specifically, 
predictors which systematically increase (or 
decrease!) in explanatory power across time are 
deserving of further attention. 
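
The idea of separate predictions to each static criterion
score can be sketched in a deliberately reduced form. The
example below uses a single predictor and the Pearson
correlation in place of the multiple prediction equations
(linear regression or MCA) that the model calls for; the
function names and the scores are hypothetical.

```python
from statistics import mean


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def parallel_prediction(predictor, criterion_by_wave):
    """Predict each wave's static criterion score separately from the same predictor."""
    return {wave: pearson_r(predictor, scores)
            for wave, scores in criterion_by_wave.items()}


# Invented data: five respondents, one predictor, three waves of the criterion.
predictor = [1, 2, 3, 4, 5]
waves = {
    "T1": [10, 12, 11, 15, 16],
    "T2": [11, 12, 12, 15, 17],
    "T3": [12, 11, 13, 14, 18],
}
by_wave = parallel_prediction(predictor, waves)
for wave, r in by_wave.items():
    print(wave, round(r, 3))
```

Reading the coefficients side by side, wave by wave, is what
allows a systematic rise or fall in predictability to be
noticed.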

In cases where neither the overall predicta- 
bility nor the relative explanatory power of the 
predictor variables change over time, the multiple 
prediction equations may be of considerable 
interest in their own right. In such cases, the 
regression equations in the prediction of separate 
static criteria will resemble one another quite 
closely, and any one of them could be used to 
describe the relationships that exist between the 
criterion and the set of predictors. 

In instances where the relative power of the set of
predictors does change across time, one may be interested in
knowing whether there is consistency in the kinds of changes
which are taking place. A procedure is described in Chapter
4 which may be used for identifying such "trends."

² Multiple regression models (both linear and MCA) will be
useful at this stage in the analysis.

Summary 

A "parallel prediction" model for longitudinal 
analysis has been described. The model makes 
separate use of each repetition of the criterion 
dimension. The proposed strategy seems to be 
widely applicable in studies employing panel 
designs; it avoids the messy philosophical and 
analytical problems inherent in the use of any and 
all kinds of change scores, and it provides 
descriptive data which are interesting in their 
own right. 
Chapter 6

EPILOGUE

Since completing the first five chapters (which were
submitted in partial fulfillment of the degree of Doctor of
Philosophy in the University of Michigan),¹ the author has
applied the proposed parallel prediction strategy in a
limited set of analyses of Youth in Transition data. This
epilogue will be devoted to a brief report of the results of
these analytic efforts.

The Use of Raw Change Scores to Identify Differential 
Shifts in Subgroups Across Time 

In Chapter 4 (see esp. pp. 63-68) , a procedure 
was proposed for investigating the issue of change 
via examining subgroup shifts across time. Re- 
ported below are results from two sets of such ana- 
lyses, each set predicting to six important criterion 
dimensions. The first set is based on subgroups 
identified by the Socio-Economic Level (SEL) of the 
respondent’s family, and the second set on the re- 
spondent’s own level of intelligence, as measured by 
the Quick Test (QT) . These two variables have been 

1 Jerald G. Bachman and William M. Cave served
as Co-Chairmen. Other committee members were Frank
M. Andrews and LaVerne S. Collet.
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chosen both for their theoretical interest as well as 
for their predictive utility with Youth in Transition 
criteria. 2 

First of all, let us look at the cross-time
relationships between SEL and Self Esteem. In Chapter
Three we saw that (1) Self Esteem scores were quite
stable in their means and standard deviations across 
time (Table 3-1) , (2) the autocorrelations were in- 
versely related to the length of the interval between 
observations (Table 3-2) , (3) the internal consis- 
tency of the scale remained stable at a respectable 
level from Time One to Time Three (Table 3-3), and (4) 
the reliabilities of the raw and regressed gain scores 
were not too impressive (Table 3-4). Therefore, it 
is not too surprising to note in Table 6-1 below that 
the relationship between SEL and Self Esteem does not 
deteriorate to any degree across time and that the 
overall raw change (Time 3-Time 1 static scores) in 
Self Esteem is rather unrelated to SEL. Similar ob- 
servations are obtained for Negative Affective States 
and Occupational Aspirations. However, two of the 
other three criterion dimensions (Academic Achieve- 
ment Value and Social Values) demonstrate relation- 
ships between overall change and SEL which are lar- 
ger than the static relationships. Can we observe 
interesting subgroup shifts in these latter two 
cases? 



2 See Bachman, 1970, for operational definitions
of these variables and for empirical evidence of
their predictive power.
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TABLE 6-1 

RELATIONSHIPS BETWEEN SEL AND 
SIX SELECTED CRITERIA ACROSS TIME 1 



                             Time 1     Time 2     Time 3     Time 1 to 3
Criterion                    (Static)   (Static)   (Static)   (Raw Change)
Dimension                    Score      Score      Score      Score

Self Esteem                   .13        .10        .11        .04
Negative Affective States     .08        .04        .08        .06
Occupational Aspirations      .35        .29        .32        .05
Ambitious Job Attitudes       .19        .11        .12        .10
Academic Achievement Value    .15        .08        .12        .19
Social Values                 .15        .07        .08        .17

1 Entries in this table are eta coefficients.


In order to answer this question, let us first
plot the results of our "parallel predictions" from
SEL to the static scores. (See Figure 6-1 below.)

As can be seen in Part A of this figure, the lowest 
SEL group tended to value Academic Achievement more 
in their senior year than in their sophomore year, 
whereas the groups who are average or above average 
in SEL decreased in their value of academic achievement.

FIGURE 6-1

SEL VS. AVERAGE STATIC (A) ACADEMIC ACHIEVEMENT VALUE
AND (B) SOCIAL VALUES SCORES AT TIMES 1, 2, AND 3

There is a rather strong monotonic relationship
between SEL and average difference in Academic
Achievement in these data. Thus, the relationship 
between SEL and raw change on this dimension observed 
in Table 6-1 was indicative of a very interesting 
subgroup shift. It should be pointed out that this 
relationship was observed in spite of the fact that 
the magnitude of the static relationships (as re- 
flected by eta coefficients) did not change drama- 
tically across time. 

In Part B of Figure 6-1, we may examine the data 
relating SEL to Social Values scores across time. 

Here we see that the relationship between SEL and 
Social Values has dropped markedly from Time 1 to 
Time 3. However, we again observe a rather strong 
monotonic relationship between SEL and the average 
group differences between Times 1 and 3. As in the 
previous case, interesting subgroup shifts were
evidenced in the figure in a situation where the raw
change score indicated such a relationship.

In contrast to these situations, let us examine 
similar plots for the three criteria for which SEL 
and overall change were unrelated. (See Figure 6-2 
below.) In each of these figures, the three lines 
are observed to be relatively parallel to one another; 
thus the six SEL subgroups do not appear to be chang- 
ing differentially in these situations where the raw 
change score indicated little, if any, relationship. 
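
The comparison of subgroup lines described above amounts to computing each subgroup's mean at Time 1 and Time 3 and looking at the differences. The following is a minimal sketch under assumed inputs: the record layout `(subgroup, time1_score, time3_score)` and the function names are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean


def subgroup_shifts(records):
    """Return {subgroup: (mean_t1, mean_t3, mean_t3 - mean_t1)}.

    records is an iterable of (subgroup, time1_score, time3_score).
    """
    by_group = defaultdict(list)
    for group, t1, t3 in records:
        by_group[group].append((t1, t3))
    shifts = {}
    for group, pairs in sorted(by_group.items()):
        m1 = mean(p[0] for p in pairs)
        m3 = mean(p[1] for p in pairs)
        shifts[group] = (m1, m3, m3 - m1)
    return shifts


def is_monotonic(values):
    """True if values never decrease, or never increase, along the list."""
    increasing = all(a <= b for a, b in zip(values, values[1:]))
    decreasing = all(a >= b for a, b in zip(values, values[1:]))
    return increasing or decreasing
```

Roughly equal differences across ordered subgroups correspond to the parallel lines of Figure 6-2; differences that rise or fall steadily with subgroup level correspond to the monotonic shifts seen in Figure 6-1.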

The criterion dimension of Ambitious Job Atti- 
tudes is of interest because it provides a situation 
where the static relationship with SEL decreases 
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FIGURE 6-2 

SEL VS. AVERAGE STATIC SCORES AT TIMES 1, 2, AND 3 
(A) SELF ESTEEM, (B) NEGATIVE AFFECTIVE STATES, 
AND (C) OCCUPATIONAL ASPIRATIONS 
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across time (as we saw with Social Values earlier) , 
but where the relationship between SEL and overall 
change is considerably smaller than was the case with 
Social Values. Does this suggest that only some of 
the subgroups are shifting across time? 

To answer this question, examine Figure 6-3. 

It may be noted here that there is a largely monoto- 
nic relationship between SEL and the average subgroup 
differences between Time 1 and Time 3 on Ambitious Job 
Attitudes, but that this relationship is a good deal 
stronger for above average SEL groups than for below 
average. Thus, it does appear to be the case that 
about half of the six SEL subgroups are accounting 
for most of the SEL vs. raw change relationship. 



FIGURE 6-3 

SEL VS. AVERAGE STATIC AMBITIOUS JOB ATTITUDES 
SCORES AT TIMES 1, 2, AND 3 
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Previous analyses have documented the fact that 
SEL and various measures of intelligence are strongly 
related in our sample of boys. 3 Therefore, it should 
come as no surprise to learn that the relationships 
between one such intelligence measure, the Quick Test 
(QT) , and the static and raw change scores on the six 
selected criterion dimensions are very similar to 

TABLE 6-2 

RELATIONSHIPS BETWEEN QT AND 
SIX SELECTED CRITERIA ACROSS TIME 1 



                             Time 1     Time 2     Time 3     Time 1 to 3
Criterion                    (Static)   (Static)   (Static)   (Raw Change)
Dimension                    Score      Score      Score      Score

Self Esteem                   .14        .13        .12        .05
Negative Affective States     .06        .04        .06        .02
Occupational Aspirations      .33        .31        .33        .02
Ambitious Job Attitudes       .23        .16        .16        .11
Academic Achievement Value    .11        .07        .11        .18
Social Values                 .14        .08        .06        .13

1 Entries in this table are eta coefficients.



3 See Bachman, 1970, for several indications and
discussions of these relationships.
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FIGURE 6-4 

QT VS. AVERAGE STATIC SCORES FOR SIX CRITERIA 
AT TIMES 1, 2, AND 3
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those already reported for SEL. Data summarizing 
these relationships (in a fashion similar to the 
treatment afforded SEL) may be found in Table 6-2 and 
Figure 6-4. Because of the great similarity to the 
SEL displays, these data will not be discussed here. 
The interested reader is invited to explore them as 
he wishes. 

From these few analyses, it would appear as if 
the proposed parallel prediction model may be a good 
one for examining differential shifts among subgroups. 
We have seen evidence that even when the criterion 
dimension possesses considerable stability across 
time, the procedure may provide data of considerable 
interest. Furthermore, the relationships with the 
overall raw change scores appear to be useful indi- 
cators of situations in which subgroups are shifting 
differentially across time. 
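
For readers who wish to reproduce coefficients of the kind reported in Tables 6-1 and 6-2, the sketch below computes eta (the square root of the between-group share of the total sum of squares) for either a static score or a raw change score. The dictionary layout and function names are assumptions made for this illustration, not the study's actual processing.

```python
from statistics import mean


def eta(scores_by_group):
    """Correlation ratio: sqrt(SS_between / SS_total) for grouped scores."""
    all_scores = [s for scores in scores_by_group.values() for s in scores]
    grand = mean(all_scores)
    ss_total = sum((s - grand) ** 2 for s in all_scores)
    ss_between = sum(
        len(scores) * (mean(scores) - grand) ** 2
        for scores in scores_by_group.values()
    )
    return (ss_between / ss_total) ** 0.5 if ss_total > 0 else 0.0


def raw_change_by_group(t1_by_group, t3_by_group):
    """Time 3 minus Time 1 for each respondent, kept within subgroups."""
    return {
        group: [b - a for a, b in zip(t1_by_group[group], t3_by_group[group])]
        for group in t1_by_group
    }
```

Applying `eta` to each wave's static scores and then to the output of `raw_change_by_group` gives the four columns of such a table for one criterion.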

Application of the Proposed Procedure to Groups of 
Empirically-Defined "Changers" 

In Chapter 4, we examined a technique used by 
two previous researchers (Trent and Medsker, op. cit.) 
to identify subgroups of interest by using the raw 
change scores themselves. The procedure they sug- 
gested for this purpose is based on raw change scores, 
and it results in three subgroups of potential
interest: Exceptional Changers (EC) are those who
change most positively, Negative Changers (NC) are
those who change most negatively, and Average
Changers (AC) are those who change the least in either
direction. 4 As we noted in Chapter 4, this kind of
definition of change does not consider whatever re- 
gression effect may be observed in the data, and 
because of this fact, the utility of the procedure 
seemed to be questionable, at best. 

In order to consider further the utility of this 
type of analysis, the author has defined the three 
types of change groups just described for each of the 
six selected criterion dimensions reported in the 
preceding section. For each criterion separately,
average Time 1 and Time 3 scores were then calculated 
for each type of change group. Results of these 
analyses are reported in Table 6-3. By comparing 
the Time 1 and Time 3 subgroup means to the Time 1 
and Time 3 grand means, we may observe the following: 

(1) EC groups always have average scores which, 
at Time 1, are below the (Time 1) grand mean 
and, at Time 3, are above the (Time 3) grand 
mean. 

(2) NC groups always have average scores which, 
at Time 1, are above the (Time 1) grand mean 
and, at Time 3, are below the (Time 3) grand 
mean. 

(3) AC groups always have average scores which
are very near the grand mean, both for Times
1 and 3.

In short, the lower the initial score (or the higher 
the final score) , the more likely one is to be an 
Exceptional Changer, and the higher the initial score 

4 See p. 68 for a review of the specific procedure,
if desired.
TABLE 6-3

"CHANGE GROUPS" VS. AVERAGE STATIC SCORES
AT TIMES 1, 2, AND 3

Criterion                     Exceptional   Average    Negative   Grand
Dimension                     Changers      Changers   Changers   Mean

Self Esteem     T1              3.41          3.80       4.09       3.77
                T2              3.88          3.85       3.78       3.84
                T3              4.17          3.90       3.55       3.88
                no. of cases    335           708        319        1362

Negative        T1              2.30          2.57       2.98       2.59
Affective       T2              2.69          2.53       2.48       2.56
States          T3              2.94          2.49       2.22       2.53
                no. of cases    282           799        264        1345

Occupational    T1              41.7          67.7       78.9       65.4
Aspirations     T2              60.8          65.8       55.5       63.0
                T3              72.6          64.5       39.4       61.2
                no. of cases    157           601        170        928

Ambitious       T1              4.56          5.19       5.65       5.15
Job             T2              5.29          5.32       5.26       5.30
Attitudes       T3              5.73          5.34       4.81       5.31
                no. of cases    272           794        269        1335

Academic        T1              4.49          5.28       5.62       5.19
Achievement     T2              5.03          5.08       4.94       5.04
Value           T3              5.42          5.04       4.29       4.94
                no. of cases    265           740        303        1308

Social          T1              4.18          4.77       5.20       4.75
Values          T2              4.74          4.78       4.81       4.78
                T3              4.98          4.78       4.50       4.76
                no. of cases    248           808        284        1340
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(or the lower the final score), the more likely one 
is to be a Negative Changer. Thus, the disturbing 
relationships (between type of changer and initial 
and final score) reported earlier in Table 4-5 are 
most assuredly a result of defining change groups in 
this fashion. 
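
The pattern in Table 6-3 can be reproduced with simulated data in which no true change occurs at all. The sketch below is only an illustration under stated assumptions: each simulated respondent has a fixed true score observed twice with independent error, and the change groups are formed by taking the top and bottom quarter of raw change scores. The quartile cutoff and all function names are hypothetical; the specific Trent and Medsker cutoffs are not reproduced here.

```python
import random
from statistics import mean


def simulate(n=4000, seed=0):
    """Two error-laden observations of an unchanging true score."""
    rng = random.Random(seed)
    true = [rng.gauss(0, 1) for _ in range(n)]
    t1 = [t + rng.gauss(0, 1) for t in true]
    t3 = [t + rng.gauss(0, 1) for t in true]
    return t1, t3


def change_groups(t1, t3, fraction=0.25):
    """Index lists for Negative, Average, and Exceptional Changers."""
    order = sorted(range(len(t1)), key=lambda i: t3[i] - t1[i])
    k = int(len(order) * fraction)
    return {
        "NC": order[:k],
        "AC": order[k:len(order) - k],
        "EC": order[len(order) - k:],
    }


def group_means(t1, t3, groups):
    """Return {group: (mean Time 1 score, mean Time 3 score)}."""
    return {
        name: (mean(t1[i] for i in idx), mean(t3[i] for i in idx))
        for name, idx in groups.items()
    }
```

Even though nobody's true score changes in this simulation, the "Exceptional Changers" start below the grand mean and end above it, and the "Negative Changers" do the reverse, which is the regression artifact discussed in the text.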

Summary 

The proposed parallel prediction model has been 
applied to subgroup analyses. Plotting the average 
criterion value separately at each point in time for 
each subgroup provides a concise picture of subgroup 
shifts across time. Even when means and standard 
deviations of the criterion were observed to be quite 
stable from sophomore through senior years, interest- 
ing subgroup shifts have been observed. Of parti- 
cular note is the fact that relationships between 
raw change and subgroup level provided consistent 
indicators of differential subgroup shifts. 

Additional analyses of the type described 
earlier aimed at identifying three subgroups using 
raw change scores further documented the confounding 
of change with regression inherent in this definition. 
Those who were defined as having the largest positive 
change were consistently observed to have the lowest 
average initial score and the highest average final 
score. Similarly, those who were defined as having 
the largest negative change were consistently
observed to have the highest average initial score and
the lowest average final score. The value of ana- 
lyses based on groups defined in this fashion seems 
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doubtful, at best. 

The early identification of subgroups is thus 
seen to have a facilitating effect in longitudinal 
analyses. Examining the pattern of subgroup shifts 
across time may provide interesting and insightful 
looks at the data, even if no aggregate shifts are 
observed. In spite of their well-documented weak- 
nesses, raw change scores may greatly facilitate 
such analytic efforts, provided they are interpreted 
with caution based on a clear realization of their 
potential for bias. 
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Appendix A 



THE APPLICABILITY OF THE 
COLEMAN MODEL FOR ESTIMATING 
TRUE GAIN FROM THREE 
WAVES OF DATA 



The general model given by Coleman is an
expression for the rate of change ($dX_1/dt$) in a
variable as a linear additive function of the
variable itself ($X_1$) and other independent variables
($X_2, X_3, \ldots, X_N$). For the sake of simplicity, we
will examine the situation where there is only one
independent variable ($X_2$), but the discussion may
be readily extended to the general case of N
independent variables.

This leads to a differential equation as
follows:

$$\frac{dX_1}{dt} = a + b_1 X_1 + b_2 X_2$$

Integration of this expression yields an equation
of the following form:

$$X_{1t} = \frac{a}{b_1}\left(e^{b_1 \Delta t} - 1\right) + e^{b_1 \Delta t} X_{10} + \frac{b_2}{b_1}\left(e^{b_1 \Delta t} - 1\right) X_2 .$$

Here, the second subscript on the $X_1$ terms indicates
the source of the measure; the initial time is
represented by 0 and some later time is represented


1 Coleman, 1968, pp. 428-478. Formula 11.66
(p. 456) is incorrect (Coleman, personal communications,
1969). It should be:



b le * ln 



(V ) 2 

V 
by t. Note that this is a linear equation of the
form:

$$X_{1t} = a^* + b_1^* X_{10} + b_2^* X_2 .$$

Linear regression may be used to estimate the values of the
coefficients in this equation. If we set

$$c^* = \frac{b_1 \Delta t}{e^{b_1 \Delta t} - 1} = \frac{\ln b_1^*}{b_1^* - 1},$$

then the coefficients in the original differential equation are:

$$a = \frac{a^* c^*}{\Delta t}, \qquad b_1 = \frac{\ln b_1^*}{\Delta t}, \qquad \text{and} \qquad b_2 = \frac{b_2^* c^*}{\Delta t} .$$
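
The conversion from fitted regression coefficients back to the parameters of the differential equation can be sketched as follows. This is an illustrative sketch, not code from Coleman or from the study: the function names are hypothetical, and the treatment of the limiting case where the lagged coefficient equals one is an added assumption.

```python
import math


def regression_coefficients(a, b1, b2, delta_t):
    """Integrated form: (a*, b1*, b2*) implied by dX1/dt = a + b1*X1 + b2*X2."""
    growth = math.exp(b1 * delta_t)
    # (e^{b1 dt} - 1) / b1 tends to delta_t as b1 tends to zero.
    scale = (growth - 1.0) / b1 if b1 != 0 else delta_t
    return a * scale, growth, b2 * scale


def coleman_parameters(a_star, b1_star, b2_star, delta_t):
    """Recover (a, b1, b2) from regression estimates (a*, b1*, b2*)."""
    if b1_star <= 0:
        raise ValueError("b1* must be positive for its logarithm to exist")
    # c* = ln(b1*) / (b1* - 1), which tends to 1 as b1* tends to 1.
    c_star = math.log(b1_star) / (b1_star - 1.0) if b1_star != 1 else 1.0
    a = a_star * c_star / delta_t
    b1 = math.log(b1_star) / delta_t
    b2 = b2_star * c_star / delta_t
    return a, b1, b2
```

A round trip through both functions returns the starting parameters, which is a convenient check that the two forms of the model are consistent with one another.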



Coleman next discussed the effect of measurement
error in $X_1$ on the estimates. Rather than
introducing the mathematics here, the more 
important thing is to understand that we are try- 
ing to partition observed or raw change into a 
component reflecting true change and a second 
component due to measurement error. This parti- 
tioning is extremely important to avoid the 
situation in which “...measurement error is 
masquerading as change" (Coleman, 1968, p. 453) 
by causing changes in the observed scores over 
time when there is no true change taking place. 

To this end data from the third observation are 
brought into the picture. Now, if only measure- 
ment error is causing the observed change in the 
dependent variable then the relation between the 
Time 1 and Time 2 scores ought to be the same as 
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the relation between the Time 1 and Time 3 scores 
(or, for that matter, between the Time 2 and Time 3 
scores). But if there is no measurement error, the 
relation between the Time 1 and Time 3 scores 
should be less than between the Time 1 and Time 2 
(or Time 2 and Time 3) scores, because the greater 
length of time has allowed more change to occur. 
Coleman proceeds to extend the model in such a way 
as to permit estimating the relative importance 
of these two components (Coleman, 1968, pp. 453- 
456) . 
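
The contrast between the two cases can be made explicit under two idealized models. Both are illustrative assumptions introduced here rather than part of Coleman's development: (a) a fixed true score $T$ observed at each wave with independent errors of equal variance, and (b) an error-free, stationary first-order autoregressive process with coefficient $\rho$.

```latex
% (a) Measurement error only: X_t = T + e_t, with Var(e_t) = \sigma_e^2 at every wave
\operatorname{corr}(X_1, X_2) = \operatorname{corr}(X_2, X_3) = \operatorname{corr}(X_1, X_3)
  = \frac{\sigma_T^2}{\sigma_T^2 + \sigma_e^2}

% (b) True change only: X_{t+1} = \rho X_t + u_{t+1}, with |\rho| < 1 and no measurement error
\operatorname{corr}(X_1, X_2) = \operatorname{corr}(X_2, X_3) = \rho,
\qquad
\operatorname{corr}(X_1, X_3) = \rho^2 \le |\rho|
```

Under (a) the correlation does not depend on the length of the interval, while under (b) it shrinks as the interval lengthens, which is the distinction the third wave of data is meant to exploit.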

The approach appears well suited to our
purposes: it encompasses both the issue of
unreliability and the simultaneous use of the
data from all three observation periods. However,
perhaps the most critical assumption underlying
the model is subject to question in our study;
that is, it is difficult to imagine that the rate
of change through the interval spanned by the
study is a constant for many of our dependent
variables. Because we feel very uncomfortable
about making this assumption for many of our
variables, the utility of the Coleman model is at
best limited.
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RELATIONSHIP BETWEEN 
INITIAL SCORE AND 
RAW GAIN SCORE 



Following are calculations showing that, in
the typical case where initial and final scores
have approximately equal variances, raw gain
scores will show negative correlations with initial
scores.

Let A = the initial raw score,
let B = the final raw score, and
let G = the raw gain score = B - A.

Now the correlation between the initial and gain
scores may be represented by the formula: 1

$$r_{AG} = \frac{r_{AB}\,\sigma_B - \sigma_A}{\sigma_G} \qquad (1)$$

We can express $\sigma_G$ in terms of A and B as follows: 2

$$\sigma_G^2 = \sigma_{B-A}^2 = (1)^2 \sigma_B^2 + (-1)^2 \sigma_A^2 = \sigma_B^2 + \sigma_A^2 .$$

Hence

$$\sigma_G = \sqrt{\sigma_B^2 + \sigma_A^2} . \qquad (2)$$

(Note: $\sigma_G = \sigma_B \sqrt{2}$ when $\sigma_B = \sigma_A$.) Now we want
to examine the value of $r_{AG}$ in the situation
where $\sigma_A = \sigma_B$, the usual case. Substituting
$\sigma_A$ for $\sigma_B$ in equations (1) and (2)
we get: 3

$$r_{AG} = \frac{r_{AB}\,\sigma_A - \sigma_A}{\sigma_A \sqrt{2}}$$

Note that the numerator of this coefficient will
always be negative, except when $r_{AB} = 1$, in which
case the ratio assumes the value zero. Thus, in
the usual case where the variances of the initial
and final scores are approximately equal, there
will be a negative correlation between the initial
score and raw gain score.

1 Shaycoft, 1967, p. 3-12.

2 Hays, 1963, p. 236. This formula assumes
that A and B are independent.

3 Shaycoft's formula (2) on p. 3-12 appears
to be in error in this regard.
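
As a numerical cross-check that does not rely on the independence assumption noted in footnote 2, the general identity for the variance of a difference, $\sigma_G^2 = \sigma_A^2 + \sigma_B^2 - 2 r_{AB} \sigma_A \sigma_B$, gives $r_{AG} = -\sqrt{(1 - r_{AB})/2}$ when the two variances are equal. The sketch below compares that expression with simulated data. The simulation design (standard normal scores, the sample size, the seed) and the function names are assumptions made for this illustration only.

```python
import math
import random


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def exact_r_ag(r_ab):
    """Initial-vs-gain correlation when initial and final variances are equal."""
    return -math.sqrt((1.0 - r_ab) / 2.0)


def simulate_r_ag(r_ab, n=20000, seed=1):
    """Simulate equal-variance A and B correlated r_ab; return corr(A, B - A)."""
    rng = random.Random(seed)
    a = [rng.gauss(0, 1) for _ in range(n)]
    b = [r_ab * ai + math.sqrt(1.0 - r_ab ** 2) * rng.gauss(0, 1) for ai in a]
    gain = [bi - ai for ai, bi in zip(a, b)]
    return pearson(a, gain)
```

At $r_{AB} = 0$ the general expression coincides with the appendix result, and for any $r_{AB}$ below one it remains negative, so the conclusion of the appendix holds in the general case as well.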
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COMPARISONS OF ADDITIVE 
AND COMPOSITE PREDICTIONS 
OF NINE TIME 3 CRITERION SCORES 
FROM TIME 1 AND TIME 2 SCORES 
ON THE SAME DIMENSION 



Following are tables summarizing the additive 
and composite predictions of the Time 3 measure 
from the Time 1 and Time 2 measures for nine cri- 
terion dimensions. The entries in these tables 
follow the format used in Tables 4-2 and 4-3 in 
the text. Namely, the frequency, mean, and
standard deviation of the Time 3 score are given
for each of the nine cells. The marginal means
and frequencies are given for each value of the
Time 1 and Time 2 trichotomies. For the table as
a whole, the overall Time 3 mean, standard deviation,
and total frequency are given in the cell in
the lower right corner. Finally, the adjusted
multiple correlation coefficient (from the additive
MCA prediction of the Time 3 from the Time 1
and Time 2 scores) and the eta coefficient (from
the analysis of variance prediction of the Time 3
from the Time 1 by Time 2 composite variable) are
given under the standard deviation in the lower
right corner cell.
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SELF-ESTEEM: 

ADDITIVE VS. COMPOSITE PREDICTION 





                                  TIME 2
                    1           2           3           TOTAL

           1      n=163       n=179       n=20        N=362
                  x=335       x=373       x=419       X=358
                  sd=45       sd=41       sd=45

TIME 1     2      n=119       n=485       n=133       N=737
                  x=352       x=392       x=426       X=391
                  sd=40       sd=39       sd=37

           3      n=8         n=120       n=122       N=250
                  x=376       x=409       x=440       X=423
                  sd=37       sd=37       sd=33

       TOTAL      N=290       N=784       N=275       N=1349
                  X=343       X=390       X=432       X=388
                                                      SD=49
                                                      R=.613
                                                      eta=.616
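
The eta coefficient for the composite (Time 1 by Time 2) prediction can be approximated from the cell entries alone. The sketch below does this for the nine Self-Esteem cells, under the assumption that each tabled standard deviation was computed with an n - 1 denominator; because the tabled means and standard deviations are rounded, the result only approximately matches the reported value. The function name and constant name are hypothetical.

```python
def eta_from_cells(cells):
    """Approximate eta from per-cell (n, mean, sd) summaries."""
    total_n = sum(n for n, _, _ in cells)
    grand = sum(n * m for n, m, _ in cells) / total_n
    ss_between = sum(n * (m - grand) ** 2 for n, m, _ in cells)
    # Assumes each sd used an n - 1 denominator (not stated in the source).
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in cells)
    return (ss_between / (ss_between + ss_within)) ** 0.5


# (n, mean, sd) for the nine cells of the Self-Esteem table,
# rows = Time 1 category, columns = Time 2 category.
SELF_ESTEEM_CELLS = [
    (163, 335, 45), (179, 373, 41), (20, 419, 45),
    (119, 352, 40), (485, 392, 39), (133, 426, 37),
    (8, 376, 37), (120, 409, 37), (122, 440, 33),
]
```

Calling `eta_from_cells(SELF_ESTEEM_CELLS)` gives a value close to the eta reported in the lower right corner cell of the Self-Esteem table.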
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NEGATIVE AFFECTIVE STATES: 
ADDITIVE VS. COMPOSITE PREDICTION 



                                  TIME 2
                    1           2           3           TOTAL

           1      n=141       n=100       n=9         N=250
                  x=201       x=239       x=281       X=219
                  sd=46       sd=40       sd=48

TIME 1     2      n=144       n=533       n=110       N=787
                  x=211       x=251       x=290       X=249
                  sd=36       sd=39       sd=44

           3      n=15        n=127       n=150       N=292
                  x=229       x=270       x=322       X=295
                  sd=55       sd=39       sd=44

       TOTAL      N=300       N=760       N=269       N=1329
                  X=207       X=253       X=308       X=254
                                                      SD=53
                                                      R=.644
                                                      eta=.647
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SOCIAL VALUES: 

ADDITIVE VS. COMPOSITE PREDICTION 









                                  TIME 2
                    1           2           3           TOTAL

           1      n=150       n=143       n=26        N=319
                  x=436       x=463       x=494       X=453
                  sd=41       sd=40       sd=50

TIME 1     2      n=97        n=440       n=130       N=667
                  x=446       x=471       x=506       X=474
                  sd=38       sd=33       sd=37

           3      n=30        n=135       n=168       N=333
                  x=468       x=486       x=520       X=502
                  sd=68       sd=32       sd=36

       TOTAL      N=277       N=718       N=324       N=1319
                  X=443       X=472       X=512       X=476
                                                      SD=63
                                                      R=.557
                                                      eta=.558







APPENDIX C 






AMBITIOUS JOB ATTITUDES: 
ADDITIVE VS. COMPOSITE PREDICTION 





                                  TIME 2
                    1           2           3           TOTAL

           1      n=138       n=202       n=30        N=370
                  x=478       x=509       x=571       X=503
                  sd=55       sd=59       sd=58

TIME 1     2      n=113       n=413       n=173       N=699
                  x=490       x=534       x=567       X=535
                  sd=57       sd=51       sd=49

           3      n=10        n=114       n=132       N=256
                  x=510       x=543       x=581       X=562
                  sd=69       sd=58       sd=53

       TOTAL      N=261       N=729       N=335       N=1325
                  X=484       X=529       X=573       X=531
                                                      SD=62
                                                      R=.499
                                                      eta=.504
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ASPIRED OCCUPATION: 
ADDITIVE VS. COMPOSITE PREDICTION 





                                  TIME 2
                    1           2           3           TOTAL

           1      n=86        n=34        n=7         N=127
                  x=29        x=49        x=47        X=35
                  sd=19       sd=27       sd=21

TIME 1     2      n=33        n=304       n=65        N=402
                  x=38        x=62        x=76        X=63
                  sd=22       sd=18       sd=16

           3      n=12        n=105       n=158       N=275
                  x=52        x=62        x=81        X=73
                  sd=28       sd=20       sd=14

       TOTAL      N=131       N=443       N=230       N=804
                  X=33        X=61        X=79        X=62
                                                      SD=24
                                                      R=.642
                                                      eta=.652
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TOTAL DELINQUENCY: 

ADDITIVE VS. COMPOSITE PREDICTION 







                                  TIME 2
                    1           2           3           TOTAL

           1      n=186       n=113       n=8         N=307
                  x=119       x=142       x=189       X=129
                  sd=15       sd=29       sd=61

TIME 1     2      n=141       n=536       n=91        N=768
                  x=129       x=157       x=196       X=156
                  sd=20       sd=32       sd=46

           3      n=11        n=121       n=120       N=252
                  x=143       x=175       x=207       X=189
                  sd=23       sd=37       sd=38

       TOTAL      N=338       N=770       N=219       N=1327
                  X=124       X=158       X=202       X=156
                                                      SD=41
                                                      R=.638
                                                      eta=.641
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ACADEMIC ACHIEVEMENT VALUE: 
ADDITIVE VS. COMPOSITE PREDICTION 









                                  TIME 2
                    1           2           3           TOTAL

           1      n=121       n=91        n=22        N=234
                  x=434       x=488       x=525       X=464
                  sd=68       sd=55       sd=79

TIME 1     2      n=162       n=340       n=114       N=616
                  x=461       x=494       x=530       X=492
                  sd=67       sd=59       sd=60

           3      n=56        n=193       n=167       N=416
                  x=473       x=503       x=542       X=515
                  sd=76       sd=60       sd=52

       TOTAL      N=339       N=624       N=303       N=1266
                  X=453       X=496       X=536       X=494
                                                      SD=69
                                                      R=.447
                                                      eta=.453
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INTERNAL CONTROL: 

ADDITIVE VS. COMPOSITE PREDICTION 







                                  TIME 2
                    1           2           3           TOTAL

           1      n=118       n=180       n=26        N=324
                  x=151       x=165       x=182       X=162
                  sd=22       sd=18       sd=16

TIME 1     2      n=142       n=470       n=183       N=795
                  x=156       x=171       x=185       X=171
                  sd=19       sd=18       sd=15

           3      n=14        n=93        n=107       N=214
                  x=170       x=174       x=189       X=182
                  sd=18       sd=21       sd=14

       TOTAL      N=274       N=743       N=316       N=1333
                  X=155       X=170       X=186       X=171
                                                      SD=21
                                                      R=.511
                                                      eta=.515
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NEGATIVE SCHOOL ATTITUDES ♦ 
ADDITIVE VS. COMPOSITE PREDICTION 







                                  TIME 2
                    1           2           3           TOTAL

           1      n=194       n=168       n=24        N=386
                  x=150       x=173       x=234       X=165
                  sd=38       sd=42       sd=61

TIME 1     2      n=166       n=386       n=119       N=671
                  x=162       x=190       x=226       X=189
                  sd=43       sd=45       sd=56

           3      n=23        n=135       n=107       N=265
                  x=175       x=211       x=251       X=224
                  sd=43       sd=46       sd=57

       TOTAL      N=383       N=689       N=250       N=1322
                  X=156       X=190       X=237       X=189
                                                      SD=54
                                                      R=.538
                                                      eta=.543
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The notation "t" indicates a table or 
figure in which the index entry appears . 

Academic achievement, 28t, 33 t 
Academic achievement, value of, 62t, 

78, 79t , 80t , 81, 84t, 85t, 88t 
Ambitious job attitudes, 28t, 30, 

33t, 36t, 39t, 62t, 79t, 81, 

82t, 83t, 84t , 85t, 88t 
Andrews, F. , preface, 41, 77 
Ar scott, A. , preface 
Bachman , J. , preface, 1, 9, 25, 27, 

40, 41, 63, 77, 78, 84 
Bereiter, C., 14, 16, 20 
Bereiter procedure, 20-23 
Bingham, J. , preface 
Bohrnstedt, G. , 35 
Bozoki, L., preface 
Bumpass, J. , preface 
Cave, W. , preface, 77 
Change 

average , 5-6 
individual, 5-6 
intervals for assessing, 4t 
meaning of, 5-6 
Change scores 

adjusted gain, 6, 13-26 
areas of agreement. 13-17 
independent gain, 41, 42 t, 43 
predictability of, 40-48 
reliability of, 16, 39t 
strengths and weaknesses of, 13-26 
true residual gain, 24, 45 
uses of as dependent variables, 17 
uses of as indications of overall 
shift , 7-8 

uses of as theoretical constructs, 

18 

uses of for selecting exceptional 
individuals, 17 
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uses of to arrange individuals on 
a continuum, 7-8 
uses of to estimate individual 
change, 18, 23 

uses of to examine aggregate 
shifts, 18 

uses of to identify unstable 
criteria, 7-8 

Change scores, raw (difference) , 

6, 13-26, 41, 43 
definition of, 13 
how to improve on, 17-25 
negatively correlated with initial 
score, 13-14 

positively correlated with final 
score, 14 

prediction of, 42t 
related to initial scores, 97, 98 
related to true gain scores, 48 
reliability of, (see Reliability, 
of difference scores and Relia- 
bility, of regressed gain scores) 
use of to identify subgroup shifts, 
77-86 

Coleman, J. , 6, 13, 93, 94, 95 
Coleman model for estimating "true 
gain," 93-95 

College plans, 28t, 33t, 66, 67t 
Collet, L. , preface, 77 
Composite score (see Trends, use of 
non-additive model to detect) 
Computer Services Facility, preface 
Computer Support Group of the Survey 
Research Center, preface 
Converse, P., 31 
Cope, R. , preface 

Cronbach-Alpha coefficient, 35, 36t,
39t 

Cronbach-Furby procedure, 23-25 
Cronbach, L., 6, 13, 20, 23, 24, 35,
74 

Davidson, D., preface 
Davidson, T., preface, 1, 27 
Deasy, P., preface 
Delinquent behaviors, 28t, 33t, 62 t 
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Eta coefficient, 42t, 57, 79t 

Flanders, N. , preface 

French, J., Jr., preface 

Furby, L. , 6, 13, 20, 23, 24, 74 

Gain Scores (see Change scores) 

Gerstman, R. , preface 

Glaser, B. , preface 

Green, S., preface, 63 

Happiness, 28t, 33t 

Harris, C. , 6, 13 

Hays, W. , 44, 97 

Heise, D., 16 

Holt, P., preface 

Iman, S., preface 

Independent gain scores (see Change 
scores, independent gain) 
Institute for Social Research, 
preface 
Intelligence 

as measured by Quick Test (see 
Quick Test) 

Internal consistency (see also Reli- 
ability and Change scores) 
of criteria across time, 78 
of raw gain scores, 15-17 
of static scores, 15-17, 19, 34-37 
related to reliability of change 
scores, 34, 37-38, 39t, 40 
related to stability, 34 
Internal control, 28t, 30, 33t, 

35, 36t , 39t, 62t 
Jacobs, M. , preface 
Job information test, 28t, 30, 33t, 
36t, 39t , 57, 59t 
Johnston, J., preface 
Johnston, J., Jr., preface 
Johnston, L., preface, 1, 27
Kahn, R. , preface, 1, 27 
Knapp, D., preface 
Kuder-Richardson formula 20, 36t 
Lamendella, R., preface 
Long, J. , preface 
Lord, F. , 18, 45 

Lord procedure for estimating true 
gain, 18-20 
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Mednick, M. , preface, 1, 27 
Medsker, L. , 68, 69, 86 
Multiple classification analysis (MCA) , 
41, 56 

Multiple correlation coefficient (R) , 

57 

Multiple regression models ( see Re- 
gression models, multiple) 

Navarro, H., preface 
Need for self-development , 28t, 33t 
Need for self-utilization, 28t, 33t 
Negative affective states, 28t, 33t, 

62t, 78, 79t, 82t, 84t, 85t, 88t
Negative school attitudes, 28t, 33t, 

62t 

Niaki, R. , preface 

Norstebo, G. , preface 

Number of siblings, 42t 

Nunnally, J., 35, 36t 

Occupational aspirations, status of, 

28t , 33t , 40, 42t , 43, 62t, 66, 

67t , 78, 79t, 82t , 84t, 85t, 88t 
O'Malley, P. , preface 
Paige, K. , preface 
"Parallel prediction" strategy 

applied to selected criteria, 79t 
as a way to use more than two 
"waves" of data, 10-12
as an alternative to change scores, 
8-10, 11-12 
summary of 75-76 
Plotkin, J. , preface 
Positive school attitudes, 28t, 30, 

33t , 35, 36t , 39t , 40, 60, 61t 
Project TALENT, 17 
Quick test (QT) , 42t, 77, 84t, 85t 
Race, 42t 

Rappaport, P. , preface 
Rattenbury, J., preface 
Raynor, J., preface 
Regression effect ( see also Change 
scores, adjusted gain and Change 
scores, residual gain) 
confounded with definition of 
"changer" groups, 69-70 
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Regression models, multiple, 75 
Reliability (see also Internal con- 
sistency and Stability coefficient) 
of independent gain scores, 37, 

39t 

of raw difference scores, 37, 39t, 

78 

of regressed gain scores, 38, 39t, 

78 

"split-half," 16 
"test-retest," 16
Research design 
overview of, 2t 

Residualized (true) gain score (see 
Change scores, adjusted gain or 
Change scores, true residual gain) 
Rodgers, W. , preface 
School environment 

as agent of conformity, 29 
as agent of non-conformity, 29 
influencing change, 27 
Self-esteem, 28t, 30, 33t, 36t, 39t, 
62t, 78, 79t , 82t , 84t, 85t, 88t 
Self -development, need for (see Need 
for self-development) 
Self-utilization, need for (see Need 
for self-utilization) 

Shaycoft, M. , 17, 97, 98 
Social values, 28t, 33t, 62t, 78, 

79t, 80t , 81, 84t , 85t , 88t 
Socioeconomic level (SEL) , 42t, 

77, 78, 79, 80t , 81, 82t, 83t, 

84 

Somatic symptoms, 28t, 33t 
Sonquist, J. , preface 
Stability 

in YIT criteria, 27-34 
related to internal consistency, 

related to measurement interval, 

32 

related to reliability of change 
scores, 34, 37-38, 39t, 40 
Stability coefficient, 15-17, 30, 

32, 33t 
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Status of aspired occupation ( see 

Occupational aspirations, status 
of) 

Strauss, A. , preface 
Subgroup analysis, 63-70 ( see also 
Trends) 

"counterbalancing” in, 64 
identification of "average" changers 
for, 68, 69t , 86-89 
identification of "changers" for, 

68-70 

identification of "exceptional" 
changers for, 68, 69t, 86-89 
identification of "negative" 
changers for, 68, 69t, 86-89 
Taylor, C., preface 
Thomas, B., preface 
Thomas, D., preface 
Trends (see also Change scores, 

uses of to examine aggregate shifts) , 
51-63, 76 

definition of, 51-55 

in job information test, 52t 

use of additive model to detect, 

55, 99-108 

use of non-additive model to de- 
tect, 55-57 
Trent, J., 68, 69, 86 
True scores, 18, 21, 31 

variance in accounted for by ob- 
served scores, 35 

Trust in government, 28t, 30, 33t, 

36 t, 39t 

Trust in people, 28t 
Tucker, L. , 38 
Van Duinen, E., preface 
Veerkamp, P., preface 
Wirtanen, I., preface, 63 
Youth in Transition study 
description of, preface 
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